



The  Bernoulli  Salesman  Problem: 
Asymptotic  Analysis 


L.M.  Whitaker 

Harriman  School  for  Management  and  Policy,  SUNY  at  Stony  Brook,  Stony  Brook,  NY 

and 

D.C.  Llewellyn 

School  of  Industrial  and  Systems  Engineering,  Georgia  Institute  of  Technology,  Atlanta,  GA 

Abstract 

In  this  paper,  we  present  a  probabilistic  analysis  of  the  time  versus  solution  quality 
tradeoff  of  different  implementations  of  a  basic  sequential  edge  exchange  procedure 
for  a  Traveling  Salesman  Problem.  Our  TSP  will  be  on  a  complete  graph  with  edge 
weights  assigned  independently  and  identically  according  to  a  Bernoulli  distribution. 
One  implementation  of  the  procedure  is  a  generalization  of  the  Lin-Kernighan  heuristic. 
For  this  implementation,  we  find  asymptotic  performance  guarantees  which  decrease 
geometrically  as  the  depth  of  the  search  increases.  The  basic  search  procedure  may 
be  implemented  using  a  2-change  neighborhood  structure.  This  enables  us  to  find  an 
asymptotic  performance  guarantee  for  a  2-opt  procedure. 

Key  Words:  Traveling  Salesman,  Local  Improvement,  Random  Graphs. 


1  Introduction 


We will be investigating a random TSP which we designate as the Bernoulli Salesman
problem. This will be defined as the problem of finding a tour of smallest value on a complete
graph whose edge weights are assigned independently and identically. The weight of an edge
follows a Bernoulli distribution, with the probability that an edge has weight 0 being p_0(n). We
require that p_0(n) = c/n, c* < c < 1.1 log n, where c* is a large constant. (At c = 1.1 log n,
a tour of length 0 almost always exists, which could be found by our methods, as well as
those of other authors [2], [4].)
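As an illustration of the problem definition, the following is a minimal sketch of how an instance could be sampled and a tour evaluated. The function names, the 0-based vertex labels, and the matrix representation are assumptions made for this example and are not taken from the paper.

```python
import random


def bernoulli_instance(n, c, seed=None):
    """Sample a symmetric n x n 0/1 weight matrix.

    Each edge independently receives weight 0 with probability c/n
    and weight 1 otherwise. Diagonal entries are unused.
    """
    rng = random.Random(seed)
    p0 = c / n
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weight = 0 if rng.random() < p0 else 1
            w[i][j] = w[j][i] = weight
    return w


def tour_value(w, tour):
    """Sum of the edge weights along a tour given as a vertex order."""
    n = len(tour)
    return sum(w[tour[i]][tour[(i + 1) % n]] for i in range(n))
```

With weights restricted to 0 and 1, the value of a tour is simply the number of 1-edges it uses, which is why the later sections track only the 0-edges of the current tour.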

We investigate how well local optimization can perform on this problem. We will use
an exchange method which preserves tour feasibility after every pair of exchanges. Different
implementations of this procedure yield a 2-opt procedure and the Lin-Kernighan heuristic.
A 2-opt local search algorithm uses a 2-change neighborhood, where two tours are neighbors
if they share all edges of a tour except two. Exchanging two edges in a tour for two new
edges leads to a neighboring tour. The Lin-Kernighan algorithm (denoted here as the LK
algorithm) does a series of edge exchanges; it is a variable k-change heuristic designed to
avoid local optima which would trap other procedures like 2-opt.
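The 2-change neighborhood can be made concrete with a short sketch. It assumes a tour is stored as a list of vertices and that the two removed edges are not adjacent; the helper names are hypothetical.

```python
def two_change(tour, i, j):
    """Return the neighbor obtained by one 2-change.

    The edges (tour[i], tour[i+1]) and (tour[j], tour[j+1]) are removed
    and the tour is reconnected by reversing the segment tour[i+1..j].
    """
    assert 0 <= i < j < len(tour)
    return tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]


def tour_edges(tour):
    """Undirected edge set of a tour given as a vertex order."""
    n = len(tour)
    return {frozenset((tour[k], tour[(k + 1) % n])) for k in range(n)}
```

For example, applying `two_change` to `[0, 1, 2, 3, 4, 5]` with `i = 1` and `j = 4` gives `[0, 1, 4, 3, 2, 5]`, which shares all but two edges with the original tour.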

A related procedure is the algorithm HAM, used in Frieze [4], [2] to find Hamiltonian
Cycles. This algorithm looks for Hamiltonian Cycles on a random graph. It is based upon the
extension and rotation idea usually credited to Posa [11]. The algorithm finds a Hamiltonian
Cycle on this random graph with probability which approaches the probability the cycle
exists. We use techniques based partly on this analysis to find asymptotic guaranteed tour
values for different implementations of our edge exchange procedure.

Very  few  theoretical  results  have  been  found  for  local  search  procedures.  Kern  [7]  has  a 
nice  result  for  the  Euclidean  based  TSP,  based  on  a  uniform  distribution  for  the  city  locations. 
He  showed  that  with  high  probability,  2-opting  will  find  a  local  optimum  in  polynomial  time. 
Unfortunately,  this  result  does  not  address  the  quality  of  the  local  optimum  found. 

Most TSP algorithms that yield to probabilistic analysis have the same underlying theme.
The problem is split up into small subproblems which are solved to optimality by some brute
force method (like dynamic programming). Then, these small subtours or paths are hooked
together in some way which does not cause the tour value to increase at an explosive rate.
Examples of these algorithms are Karp's dissection algorithm [6], the patching algorithm for
the asymmetric TSP [8], Frieze's TSP algorithm [4], and Steele's directed TSP algorithm [12].
All results of this type are asymptotic; they hold only as n → ∞. The asymptotic results
of these algorithms in some cases give tours that converge to the optimal tour value. However,
we have very little information that suggests these algorithms perform well in practice.

The  LK  procedure  has  good  empirical  backing.  HAM  performs  well  in  the  asymptotic 
sense.  We  tie  these  two  sets  of  information  together.  In  this  paper,  we  present  an  asymptotic 
analysis  of  a  procedure  which  generalizes  2-opting  and  the  LK  algorithm.  Specifically,  we 
find  an  asymptotic  guarantee  of  solution  quality  as  a  function  of  the  time  requirement.  This 
guarantee  shows  a  geometric  improvement  with  respect  to  the  amount  of  time  we  are  willing 
to spend. (See Theorem 5.2 and Corollary 5.1.) We also present an asymptotic guarantee
for a 2-opt procedure.

In  our  next  paper,  we  complete  the  analysis  by  presenting  empirical  and  nonasymptotic 
results  for  our  local  search  procedure.  Since  local  search  procedures  are  so  widely  used,  these 
results  could  be  useful  in  practice,  as  well  as  theoretically  interesting  and  pleasing. 

The organization of this paper is as follows. In Section 2 we introduce the important
notation and assumptions of our work. In Section 3 we describe our algorithm in detail. We
analyze the algorithm in Section 5. In Section 6 we show how our algorithm can be modified
to be a strict 2-opt procedure, and how to handle asymmetric instances. Several proofs are
deferred to the appendix.

2  Notation 

For ease of presentation the basic notation used will be given here.

𝒢(n, c/n)    The set of all complete graphs on vertices V = {1, 2, ..., n}
             with edge weights assigned independently and
             identically, and P(edge e has weight 0) = p_0(n) = c/n.

G_{n,c/n}    A random graph in 𝒢(n, c/n).

c            c* < c < 1.1 log n; c* is a large constant (c (= c(n)) may be a function of n).

E            The set of edges of G, where G = (V, E) is distributed like G_{n,c/n}.

E_0          {e ∈ E : edge e has weight 0}.

0-edge       An edge of weight 0.

1-edge       An edge of weight 1.

d_0(v)       The number of 0-edges incident with vertex v, called the
             0-degree of v.

N_0(S, G)    For S ⊂ V, N_0(S, G) = {w ∉ S : ∃ v ∈ S such that (v, w) ∈ E_0}.
             (N_0(S, G) is the neighborhood of S using only 0-edges.)

a.a.         The random graph G_{n,c/n} has property Q almost always (a.a.) if
             lim_{n→∞} P(G_{n,c/n} has property Q) = 1.

Exp[X]       The expected value of the random variable X.

             Tour value found by algorithm searching to a depth of δ.

\bar{X}      An upper bound for the sequence of random variables X_n,
             where lim_{n→∞} P(X_n < \bar{X}) = 1.

\underline{X}  A lower bound for the sequence of random variables X_n,
             where lim_{n→∞} P(X_n > \underline{X}) = 1.

             A function that is O(log n).

A  small  constant  such  that  ^  >  {  -  +  ^e-2c/3) 

3  Algorithm  BTS 

Our algorithm uses sequential edge exchanges to search for lower valued tours. A sequential
edge exchange is simply the exchanging of one edge for another adjacent to the first.
This will become clear with the introduction of the algorithm, and of the graphs which we use to
track the different steps of the algorithm.

Recall that we are given a complete graph G on n vertices, numbered 1, 2, ..., n, with
edge weights 0 and 1. An initial tour is given which may be chosen in any way independent
of our algorithm. Let T_0 = {e ∈ E : e is a 0-edge in this initial tour}. Call this the first
Base Set. In general, we have this definition.

Definition: A Base Set is a collection of 0-edges that is a subset of a tour. The Base
Set which is input to iteration k will be denoted by T_{k-1}.

During the first iteration of BTS, the algorithm tries to find a set of 0-edges of size
|T_0| + 1 = |T_1|. The set must be a subset of some tour, so it may contain no cycles and no
vertex may have 0-degree ≥ 3. (When we refer to no cycles, we always mean no cycles of
length < n. If we have a cycle of length n, BTS has found a tour of length 0.) This leads to
the following definition.

Figure 1: Tour and underlying Base Set. Edge weights are on the tour T.

Definition: An Acceptable Edge Set (AES) is a subset of E_0 such that the graph induced
by the subset has no cycles of length < n and no vertices of 0-degree > 2. Usually the AES will be
subscripted to denote how deep we have searched within an iteration to find it.
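The two conditions in this definition are easy to test mechanically. The sketch below is one possible check, assuming vertices are labeled 0, ..., n-1, n ≥ 3, and the edge list contains no repeated edges; it uses a union-find structure, which is a choice made for this example and not something the paper prescribes.

```python
def is_acceptable(n, edges):
    """Check the AES conditions for a list of 0-edges (u, v).

    Returns True if no vertex has degree above 2 and the edges contain
    no cycle of length < n (a single Hamiltonian cycle is allowed).
    """
    edges = list(edges)
    parent = list(range(n))
    degree = [0] * n

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    closed = 0  # number of edges that close a cycle
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if degree[u] > 2 or degree[v] > 2:
            return False
        ru, rv = find(u), find(v)
        if ru == rv:
            closed += 1
        else:
            parent[ru] = rv
    # With all degrees <= 2, exactly n edges and one closed cycle
    # means the edges form a single cycle through every vertex.
    return closed == 0 or (closed == 1 and len(edges) == n)
```

A set that passes this check is a union of vertex-disjoint paths (or a full tour), so it can always be extended to a tour by joining the path ends.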

Notice that by definition, an AES will be a subset of a tour. In fact, our base sets will
be acceptable edge sets. The algorithm will only keep track of this type of edge set. During
a general iteration k, BTS is given a Base Set (which must be an acceptable edge set), T_{k-1},
and it tries to find an acceptable edge set, T_k, of size 1 + |T_{k-1}| (which will then be a new
Base Set).

Define the graph G(AES_0) (which is dependent on the initial tour BTS will try to
improve upon) as follows. This graph has the same vertex set as G_{n,c/n} and edge set AES_0 =
T_0. In Figure 1, the first graph shows the initial tour T. The second shows the graph
G(AES_0). A general stage of the algorithm will consist of either adding an edge to this set,
or performing one sequential edge exchange. The first operation finds the existence of a lower
valued tour. (For example, if edge (x, w) is a 0-edge in Figure 1, we may add it to the set.)
The second gives a new acceptable edge set with cardinality |AES_0|. This set must also be
a subset of some tour. This set is denoted AES_1 since it is one sequential edge exchange
away from the Base Set AES_0.

Figure 2 represents a series of sequential edge exchanges that result in a lower valued
tour. Each graph is based on a different AES, and the sets are related as follows: AES_{d+1} =
AES_d ∪ (x_d, y_d) \ (y_d, x_{d+1}), d = 0, 1, and AES_3 = AES_2 ∪ (x_2, y_2). The edges in the sequential
exchanges form a path. Look at the final AES. Since this set is also a subset of a tour, no
edge needs to be removed.

To  help  in  future  analysis,  we  need  to  further  categorize  the  edges  in  the  sequential 
exchanges. 

Definition: An alternating path (ap) corresponding to acceptable edge set AES_δ is a
path with 2δ 0-edges. Every edge on the path is part of at least one sequential edge exchange
used to find some AES_d, 1 ≤ d ≤ δ. The path alternates between edges BTS adds to an
AES, and those it removes. The path begins at a vertex which has degree less than two in
the current Base Set.

Figure 2: Graphs of acceptable edge sets. AES_3 implies a lower valued tour.

In Figure 2 an ap corresponding to AES_2 has vertices (x_0, y_0, x_1, y_1, x_2).

Definition: A proper alternating path (pap) corresponding to acceptable edge set AES_δ
is an alternating path extended by one 0-edge not in AES_δ. Notice that a pap completes an
iteration since it finds a larger acceptable edge set.

In Figure 2, the edge set {(x_0, y_0), (y_0, x_1), (x_1, y_1), (y_1, x_2), (x_2, y_2)} forms a pap.

We can see that finding a pap ensures us of finding a lower valued tour. BTS will search
for these paps. Notice that our representation of the AES graphs (see Figures 1 and 2) has
the property that vertices on the left side of the graphs are incident to at least one 1-edge
in the Base Set. A pap must start and end with this set of vertices. Thus we leave them on
the same side of the graph until we find a new and larger Base Set. When a pap is found,
BTS resets the final AES to be AES_0 (and we update our graph), and begins again.

Searching  for  paps  is  the  main  idea  of  the  algorithm  BTS.  This  searching  is  done  by  the 
procedure  Search.  Before  presenting  Search,  we  define  the  variables  it  uses. 

Definition: Given any Base Set, AP_0 is the set of vertices which have 0-degree ≤ 1.
(These vertices may start the aps.)

Definition: Given an AES and a vertex v of 0-degree ≤ 1, we define a set of vertices
Y. A vertex y ∈ Y is such that y ∈ N_0(v) and (v, y) ∉ AES. (The 0-edges (v, y) : y ∈ Y
may be added to an ap ending at v.)

Definition: Given an AES and a vertex y ∈ Y, we define a set of vertices X. A vertex
x ∈ X is such that x ∈ N_0(y) and (y, x) ∈ AES. (The 0-edges (y, x) : x ∈ X are candidates
to be removed from the AES and thus added to the ap.)

Definition: δ_max is the maximum depth to which BTS will search.

The recursive procedure Search is presented next. In a call to the procedure Search, δ
designates the current depth of the search, AES is the current edge set, and v is the vertex
the algorithm will search from to extend the current ap. Initially v = ∅, since we have no
designated starting vertex.

Procedure Search(δ, AES, v)

Begin
    If v = ∅ and δ = 0    (starting vertex is not designated)
    Then
        Compute AP_0    (vertices which may start aps)
        For v ∈ AP_0:
            Search(0, AES, v)    (search from all vertices of 0-degree < 2)
    Else    (v ≠ ∅, and vertex to search from is specified)
        While δ < δ_max
            Compute Y    (set of next possible vertices on path)
            For y ∈ Y:
                If AES ∪ (v, y) is an acceptable edge set
                Then Search(0, AES ∪ (v, y), ∅)    (Start new iteration)
                Else
                    Compute X    (set of next possible vertices on path)
                    For x ∈ X:
                        Search(δ + 1, AES ∪ (v, y) \ (y, x), x)
                Endif
    Endif
End
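The procedure above can be sketched as a depth-limited search for a single pap. This is an illustrative reading under stated assumptions, not the authors' implementation: vertices are labeled 0, ..., n-1, `zero_adj` is a hypothetical adjacency map of 0-edges, the While guard is treated as a depth check, and the function returns the first larger acceptable edge set it finds instead of starting the next iteration itself. The acceptability check is restated so that the block runs on its own.

```python
def acceptable(n, edges):
    """True if `edges` (a set of frozenset pairs) could be part of a tour."""
    parent = list(range(n))
    degree = [0] * n

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    closed = 0
    for e in edges:
        u, v = tuple(e)
        degree[u] += 1
        degree[v] += 1
        if degree[u] > 2 or degree[v] > 2:
            return False
        ru, rv = find(u), find(v)
        if ru == rv:
            closed += 1
        else:
            parent[ru] = rv
    return closed == 0 or (closed == 1 and len(edges) == n)


def find_larger_aes(n, zero_adj, aes, delta_max):
    """Look for a proper alternating path using fewer than delta_max exchanges.

    Returns an acceptable edge set with one more edge than `aes`,
    or None if the depth-limited search fails.
    """
    aes = frozenset(aes)

    def extend(delta, cur, v):
        if delta >= delta_max:
            return None
        for y in zero_adj[v]:
            e = frozenset((v, y))
            if e in cur:
                continue
            grown = cur | {e}
            if acceptable(n, grown):
                return grown  # the ap extended by (v, y) is a pap
            for f in cur:
                if y not in f:
                    continue
                (x,) = f - {y}  # candidate edge (y, x) to remove
                swapped = grown - {f}
                if acceptable(n, swapped):
                    found = extend(delta + 1, swapped, x)
                    if found is not None:
                        return found
        return None

    for v in range(n):  # AP_0: vertices of 0-degree <= 1 in the Base Set
        if sum(v in e for e in aes) <= 1:
            found = extend(0, aes, v)
            if found is not None:
                return found
    return None
```

Calling `find_larger_aes` repeatedly, each time feeding the returned set back in as the new Base Set, mimics the restart that the procedure performs when it finds a pap.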

The  flowchart  (Figure  3)  shows  a  nonrecursive  representation  of  the  algorithm.  A  new 
iteration  occurs  when  a  larger  Base  Set  is  found,  and  the  algorithm  in  effect  starts  over. 

Figure 2 shows an iteration of BTS. The iteration stops when BTS finds a larger Base
Set. In the procedure, this occurs when the call Search(0, AES ∪ (v, y), ∅) is made. The
vertices on the pap are labelled x_j and y_j, according to the depth at which the edges on
the path were added or removed. The different graphs in the figure show the progression of
acceptable edge sets seen until BTS finds a new Base Set, which is AES_3.

4  Implementation  of  BTS 

The algorithm BTS simply makes selected calls to the procedure Search. The time required
by Search increases greatly as δ_max increases. When the current Base Set has a small cardinality,
we can reason that we can probably find a pap using a small δ_max. This notion stems
from the fact that a small Base Set means there are many vertices which may be used
to start and end paps. For this reason, as well as the time requirements, we will implement
BTS using an increasing sequence of values of δ_max.

BTS will call Search using δ_max = 1 first. When the procedure stops, we increase δ_max
to 2 and keep going. Additional increases in the maximum depth of Search are made until
BTS has searched to the maximum depth we are willing to allow.

Figure 3: Nonrecursive Flowchart of Algorithm BTS

In addition, BTS makes one more change in its calls to Search. Whenever BTS reaches
an AES at depth ⌊δ_max/2⌋, it will start a new ap. This means BTS wants to use this AES as
a Base Set, and have the option of starting new aps from any vertex of 0-degree ≤ 1 in the
AES. We can accomplish this modification by slightly changing the procedure Search. Let
SplitSearch(δ, AES, v, d) be the modified procedure Search.

Procedure SplitSearch(δ, AES, v, d)

Begin
    If v = ∅ and (δ = 0 or δ = ⌊d/2⌋)    (starting vertex is not designated)
    Then
        Compute AP_0    (vertices which may start aps)
        For v ∈ AP_0:
            SplitSearch(0, AES, v, d)    (search from all vertices of 0-degree < 2)
    Else    (v ≠ ∅, and vertex to search from is specified)
        While δ < d
            Compute Y    (set of next possible vertices on path)
            For y ∈ Y:
                If AES ∪ (v, y) is an acceptable edge set
                Then SplitSearch(0, AES ∪ (v, y), ∅, d)    (Start new iteration)
                Else
                    Compute X    (set of next possible vertices on path)
                    For x ∈ X:
                        If δ = ⌊d/2⌋ - 1
                        Then SplitSearch(δ + 1, AES ∪ (v, y) \ (y, x), ∅, d)
                        Else SplitSearch(δ + 1, AES ∪ (v, y) \ (y, x), x, d)
                        Endif
                Endif
    Endif
End

The  algorithm  BTS  may  now  be  written  as  follows: 

Algorithm  BTS 

Input the initial tour and calculate T_0. Input δ_max, the maximum depth to which BTS may
search.

Do for d = 1 to δ_max
    SplitSearch(0, AES, v, d)

Continue

Output the current AES. Find a tour containing the AES and output it as well.

STOP 

Note that ANY tour that BTS finds containing a certain AES has value n - |AES|. This is
true since any extra 0-edge which could be found by hooking the 0-edge components together
would have been found as a pap of length 1.
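The final step, finding a tour that contains the output AES, can be sketched as follows. The sketch assumes the AES is acceptable in the sense defined earlier and that vertices are labeled 0, ..., n-1; the function name and the order in which path components are chained are arbitrary choices made for illustration.

```python
def complete_to_tour(n, aes):
    """Chain the path components of an acceptable edge set into a tour.

    Returns a vertex order in which every edge of `aes` joins two
    consecutive vertices (cyclically).
    """
    adj = {v: [] for v in range(n)}
    for u, v in aes:
        adj[u].append(v)
        adj[v].append(u)
    seen, tour = set(), []
    # Visit isolated vertices and path endpoints first, so each path is
    # walked from one end; degree-2 starts occur only on a full cycle.
    for start in sorted(range(n), key=lambda v: len(adj[v])):
        if start in seen:
            continue
        cur = start
        while cur is not None:
            seen.add(cur)
            tour.append(cur)
            nxt = [w for w in adj[cur] if w not in seen]
            cur = nxt[0] if nxt else None
    return tour
```

Every AES edge appears in the resulting tour, so its value is at most n - |AES|; by the remark above, it equals n - |AES| once BTS has terminated, since no joining edge can then be a 0-edge.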
We will forgo the explanation of the need for the modification to Search until the next
chapter, which presents an asymptotic analysis of BTS. The modification has intuitive rationale
and is also needed to vastly improve the bounds the analysis gives.

The time required for BTS is dominated by the time it takes to search for new acceptable
edge sets. Angluin and Valiant [1] have developed a data structure that enables us to find
a new acceptable edge set (adding an edge and removing an edge) in time O(log n). Since
the maximum 0-degree of a vertex is 4 log n (a.a.), we have O(n^2 (log n)^{δ_max}) possible AES
operations (a.a.) per iteration. The n^2 comes from the fact that we restart the searching
after depth δ_max/2. Thus the overall time requirement is a.a. O(n^3 (4 log n)^{δ_max} log n). This
shows that even for large δ_max, say δ_max < log n / log log n, the time requirement is quite small (note
that (log n)^{log n / log log n} = n).
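The identity in the parenthetical note, and what it implies for the time bound, can be verified in two lines. The derivation below assumes natural logarithms and reads the bound on δ_max as δ_max ≤ log n / log log n; both are assumptions of this worked example.

```latex
\begin{align*}
(\log n)^{\log n/\log\log n}
  &= \exp\!\Big(\tfrac{\log n}{\log\log n}\cdot\log\log n\Big)
   = e^{\log n} = n, \\
n^{3}\,(4\log n)^{\delta_{\max}}\,\log n
  &\le n^{3}\cdot 4^{\log n/\log\log n}\cdot n\cdot\log n
   = n^{\,4+\log 4/\log\log n}\,\log n
   \qquad\text{for } \delta_{\max}\le\tfrac{\log n}{\log\log n}.
\end{align*}
```

Since log 4 / log log n tends to 0, the bound under these assumptions is n^{4+o(1)}, which is polynomial in n.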

Unfortunately, given as much time as we choose (δ_max very large), we still may not
be able to find an optimal tour. This follows because of the sequential ordering that we
talked about earlier; that is, there may be tours that we cannot find using sequential edge
exchanges (see [9]). However, we have evidence already that our algorithm will perform
well. Our algorithm is a generalization of HAM and of the LK algorithm. HAM has
nice theoretical backing. The class of graphs for which HAM is analyzed has the property
that the proportion of instances for which HAM finds a Hamiltonian Cycle approaches the
probability a Hamiltonian Cycle exists. Given a random graph G_{n,m} on n vertices and with
m = (n/2) log n + (n/2) log log n + c_n n edges:

\[
\lim_{n\to\infty} \Pr(G_{n,m} \text{ is Hamiltonian}) =
\begin{cases}
0 & \text{if } c_n \to -\infty \\
e^{-e^{-2c}} & \text{if } c_n \to c \\
1 & \text{if } c_n \to \infty.
\end{cases}
\]

This  implies  that  sequential  edge  exchanges  will  probably  take  care  of  most  of  the  searching 
we  need  to  do.  The  LK  algorithm  has  nice  empirical  backing,  so  the  algorithm  should  do 
well  even  on  small  problems.  This  also  demonstrates  the  power  of  sequential  exchanging. 


5  Analysis  of  BTS 


For the analysis, it is convenient to think of BTS in terms of separate iterations: we
assume that when BTS finds a larger Base Set, it throws out all other
information and starts over again.

This section gives an asymptotic performance analysis of BTS. Earlier we defined the
parameter for our Bernoulli distribution to be p_0(n) = c/n with c* < c < 1.1 log n. (If c > 1.1 log n we know
that a 0-length tour can be found a.a.) Many of the graph properties which we will use call
for c* to be restricted. We will need to have c* > 10 always, and will at times restrict c* to
be much larger (> 400).


5.1  Analysis  Overview 




Figure  4:  Graphs  of  acceptable  edge  sets,  some  edges  BTS  may  add  are  dashed 


Here  we  give  an  outline  of  the  steps  we  use  to  analyze  BTS.  We  will  give  an  intuitive 
explanation  of  these  steps. 

Look  at  Figure  4.  Starting  at  vertex  x0 ,  BTS  looks  for  a  new  edge  to  add.  If  BTS  adds 
an  edge  such  as  ei,  a  new  Base  Set  is  formed  and  BTS  starts  a  new  iteration.  If  an  edge 
such  as  ex  is  added,  BTS  must  remove  an  edge  to  form  a  new  AES.  Although  BTS  did  not 
find a larger Base Set, it gave itself new opportunities by finding x_1. BTS may now search
from vertex x_1. In this stage, BTS finds an edge e_2 which implies a new vertex x_2, and at
the next stage BTS finally finds a new Base Set.

A  definition  will  help  us  here. 


Definition: An ap-reachable vertex is any vertex used by BTS from which to search for
a new edge to enter an acceptable edge set. (These are the x vertices in the aps.) If this
vertex is found at depth δ during some iteration, then it will be referred to as a δ-reachable
vertex. The collection of all vertices which are δ-reachable will be denoted by AP_δ.

It is clear that the "deeper" BTS is allowed to search (the larger the δ_max), the better
the chance it has to find a new (larger) Base Set. This is a direct result of BTS being able to
find more vertices which it sees as δ-reachable vertices. Thus the first step in the analysis of
BTS will be to find a lower bound for the number of these vertices (as a function of δ_max).
(To make sure we find all vertices, we assume that BTS fails to find a larger base set during
the current iteration.)

Consider one new vertex that is discovered by BTS as a δ-reachable vertex. (See vertex
x_2 in Figure 4, for example.) [This new vertex could form a new Base Set with nearly
all vertices adjacent to at most 1 edge in the current AES, if a 0-edge connected them (for
example 0-edge (x_2, u) implies a new Base Set, while 0-edge (x_2, w) does not).] If BTS fails
during some iteration κ, many edges it sees of the above form (like (x_2, u) or ex) must have
weight 1, or BTS would use these edges to form a new Base Set. Let us call these edges
necessary 1-edges, for they are necessary to BTS's failure at iteration κ. This is made formal
in the following definition.

Definition: A necessary 1-edge is a 1-edge, e, for which there exists an ap found by BTS
that extended by e (changed into a 0-edge) would form a pap. That is, it is a 1-edge which
if changed to a 0-edge would imply a pap.

Definition: Ψ_δ is the set of necessary 1-edges found at depth δ.

The  second  step  in  this  analysis  is  to  find  a  lower  bound  for  the  number  of  necessary 
1-edges  BTS  sees  at  a  certain  iteration,  assuming  it  fails  to  improve  during  this  iteration. 
(For  ease  of  presentation,  we  refer  to  BTS  failing  during  an  iteration  if  it  is  the  final  iteration 
of  the  algorithm.) 

The third step of the analysis is to find an expression for the number of 0-edges BTS
sees in the Base Sets T_k, 0 ≤ k ≤ κ - 1. If we are to find an expression for the probability
BTS fails during iteration κ, we must have information about the number of 0-edges BTS
has already seen. In fact, there is an inverse relationship between these two items. The more
0-edges BTS sees, the higher the probability BTS has seen all 0-edges, so the probability it
fails must increase.

Now, in order to put all of this together and analyze BTS, we need one final bit of
information. Given that BTS fails during iteration κ, we look for a set of 0-edges that do
not influence the outcome of BTS. All that is necessary to find such a set is to find 0-edges
that are not in any of the Base Sets T_k, 0 ≤ k ≤ κ - 1. Any of these edges could be changed
to 1-edges, and BTS would still see all of the same Base Sets. This is the fourth step of our
analysis.

Here is a brief idea of how we put these steps together. Given a particular instance of
G, randomly choose a subset, X, of the 0-edges of G. Now run BTS on G_X, where the edges
in X have been changed to 1-edges. Suppose that BTS run on this new graph fails during
iteration κ. Assume that X did not influence the outcome of BTS in iterations 1, 2, ..., κ - 1.
Since BTS has failed at iteration κ, BTS has seen many necessary 1-edges during this
iteration. If there are enough necessary 1-edges, there will be a high probability that at least
one of them is in X. Thus BTS will probably not fail on G, even though it did fail when a
few 0-edges of G were changed to 1-edges.

For reference, here are the steps we will take:

Assume BTS fails during iteration κ:

(1) Find a lower bound for the number of vertices discovered as a δ-reachable vertex (for
0 ≤ δ ≤ δ_max - 1).

(2) Given (1), find a lower bound for the number of necessary 1-edges BTS sees during
iteration κ.

(3) Find an upper bound for the number of 0-edges seen in the Base Sets T_k, 0 ≤ k ≤ κ - 1.

(4) Establish the existence of X, a set of 0-edges that does not influence the outcome of
BTS.

(5) Put everything together for the final result.

5.2 Graph Properties


Here  we  collect  some  graph  properties  which  we  will  need  for  the  five  steps  of  our  analysis. 

Lemma 5.1 [3] Let G = G_{n,c/n} and let vertex v be 'small' if d_0(v) < c/10 and 'large' otherwise.
Let SMALL, LARGE be the sets of small and large vertices respectively. (A small
vertex has few incident 0-edges.)

Let W(E_0) = W_1(E_0) ∪ W_2(E_0) ∪ W_3(E_0) ∪ W_4(E_0), where

W_k(E_0) = {v : v is small and there exists a small w such that v and w are joined by a
path of length k comprised only of 0-edges}. (v = w is allowed for k = 3, 4.)

Let ℓ > 7 be fixed. Then for c > 20(ℓ + 1) log(ℓ + 1), G satisfies the following (a.a.):

|{v ∈ V : d_0(v) < c/10 + 1}| < n e^{-2c/3};

d_0(v) < 4 log n for all v ∈ V;

|W(E_0)| < c* e^{-4c/3} n;

∅ ≠ S ⊂ V, |S| < n/(2ℓ) and S ⊂ LARGE implies |N_0(S)| > ℓ|S|.


Next  are  some  new  lemmas. 

Lemma 5.2 Let S_i = {v : vertex v has 0-degree i}. Then G a.a. satisfies the following:

(5.2.1) | |S_0| - n(1 - c/n)^{n-1} | < n^{1/2} log n

(5.2.2) | |S_1| - (n - 1)c(1 - c/n)^{n-2} | < n^{1/2} log n

(5.2.3) | (|S_0| + |S_1|) - n(1 - c/n)^{n-1} - (n - 1)c(1 - c/n)^{n-2} | < n^{1/2} log n


This lemma is proved using Chebyshev's inequality; see the appendix for the proof.
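The centering terms in Lemma 5.2 are the expected sizes of S_0 and S_1: each vertex has n - 1 incident edges, each a 0-edge independently with probability c/n, so its 0-degree is Binomial(n - 1, c/n). The following sketch, with a hypothetical function name, evaluates these expectations directly.

```python
import math


def expected_degree_count(n, c, i):
    """Expected number of vertices with 0-degree exactly i.

    Equals n * P(Binomial(n - 1, c/n) = i), by linearity of expectation.
    """
    p = c / n
    return n * math.comb(n - 1, i) * p**i * (1 - p) ** (n - 1 - i)
```

For i = 0 this gives n(1 - c/n)^{n-1}, and for i = 1 it gives (n - 1)c(1 - c/n)^{n-2}, matching the expressions in (5.2.1) and (5.2.2).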


Lemma 5.3 Given G_{n,c/n} = (V, E), let S ⊂ V. Let E_0(S) = {e = (v, w) : v, w ∈ S and
(v, w) is a 0-edge}. Then for all ∅ ≠ S ⊂ V, G a.a. satisfies the following:

If |S| = n/j, where j + j log j < c/10, then |E_0(S)| > Exp[|E_0(S)|]/5.

To prove this lemma use the Markov Inequality, P(|a| ≥ 1) ≤ Exp(|a|)/1. Let a = number
of sets of size |S| such that |E_0(S)| < Exp[|E_0(S)|]/5. See the appendix for the full proof.

5.3 Main Steps of Analysis

The next lemma concerns the number of alternating paths of a given length that BTS will
find during one iteration. We will try to bound the number of vertices in AP_δ (for each
δ, 0 ≤ δ ≤ δ_max - 1). This will accomplish the first step in our analysis.

Before we start, recall that T_{κ-1} was the Base Set that BTS failed to improve upon (since
we assume that BTS fails during iteration κ).

Recall that if v ∈ AP_δ, then it is in some AES_δ. Notice that AP_0 is the set of vertices
with 0-degree < 2 in the Base Set T_{κ-1}.

We can find many bounds for the size of AP_δ; the following lemma gives one. It is worth
noting that this lemma really bounds the number of x_δ seen on one alternating path. Recall
that we let BTS restart at depth ⌊δ_max/2⌋. To avoid confusion, for now we will assume that
δ < ⌊δ_max/2⌋. Then, we will address this restarting in the second step of the analysis and
will take into account the added bonus of restarting our aps at δ = ⌊δ_max/2⌋. So for now,
think of BTS searching on one alternating path where δ < ⌊δ_max/2⌋.

Lemma 5.4 Suppose that BTS terminates during iteration κ on G_{n,c/n}, and that δ <
⌊δ_max/2⌋. Let ℓ > 9 be fixed and c/20 > (ℓ + 1) log(ℓ + 1) (since we use Lemma 5.1).
The following hold a.a.:

If |AP_0| > ℓ n e^{-2c/3} then:

Mfi+il  >  (l-^-)\APt\  +  J"«'2‘/3  >  (^V+,1l^ol  +  i»e-2t/3 


as  long  as  \APS\  <  yr  If  \AP&\  >  £  then  \APs+l\  >  (^)(£)- 


Proof: 

The  proof  of  this  lemma  uses  techniques  from  [4]  [5]. 

Consider edges $(x_\delta, y) \in E_0(G)$, where $x_\delta \in AP_\delta$ and $y \in N_0(x_\delta, G)$, and $\delta < \lfloor \delta_{max}/2 \rfloor$. We keep $\delta$ small since BTS in effect spreads itself out up to a depth of $\lfloor \delta_{max}/2 \rfloor$, and then starts over again. Assume $|AP_\delta| \le n/2\ell$.

If $(x_\delta, y) \notin AES_\delta$ then BTS creates a new AES by adding this edge and removing a different one incident with $y$. Such an edge must exist since BTS fails during iteration $\kappa$, and actually two edges may exist. In the latter case we will choose only one of these edges. The edge we choose is called $(y, x_{\delta+1})$ and we will refer to the vertex $x_{\delta+1}$ as $x_{\delta+1}(x_\delta, y)$. Notice that $x_{\delta+1}$ is in $AES_{\delta+1}$, a new acceptable edge set. If $(x_\delta, y) \in AES_\delta$, then let $x_{\delta+1}(x_\delta, y) = x_\delta$.

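Notice that

$$|AP_{\delta+1}| \ge |\{x_{\delta+1} = x_{\delta+1}(x_\delta, y) : x_\delta \in AP_\delta,\ y \in N_0(AP_\delta, G),\ \text{and } (x_\delta, y) \notin AES_\delta\}|$$

$$\ge |\{x_{\delta+1} = x_{\delta+1}(x_\delta, y) : x_\delta \in (AP_\delta \cap LARGE),\ y \in N_0(AP_\delta \cap LARGE)\}| - |(AP_\delta \cap LARGE)|$$

The final inequality is true since we need to subtract at most ONE edge incident to each $x_\delta$ that could be in $AES_\delta$. (There is at most one of these edges per $x_\delta$.)

Next, we find an expression for the number of vertices called $y$:

$$|N_0(AP_\delta \cap LARGE)| \le |\{y \in N_0(AP_\delta \cap LARGE) : (y, x_{\delta+1}(x_\delta, y)) \notin T_{\kappa-1}\}|$$

$$+\ 2\,|\{x_{\delta+1} : x_\delta \in (AP_\delta \cap LARGE),\ y \in N_0(AP_\delta \cap LARGE),\ (y, x_{\delta+1}) \in T_{\kappa-1}\}|$$

Clearly,

$$|\{x_{\delta+1} = x_{\delta+1}(x_\delta, y) : x_\delta \in (AP_\delta \cap LARGE),\ y \in N_0(AP_\delta \cap LARGE)\}|$$

$$\ge |\{x_{\delta+1} : x_\delta \in (AP_\delta \cap LARGE),\ y \in N_0(AP_\delta \cap LARGE),\ (y, x_{\delta+1}(x_\delta, y)) \in T_{\kappa-1}\}|$$

$$\ge \frac{|N_0(AP_\delta \cap LARGE)|}{2} - \frac{|\{y \in N_0(AP_\delta \cap LARGE) : (y, x_{\delta+1}(x_\delta, y)) \notin T_{\kappa-1}\}|}{2}$$

Therefore by substituting into the first expression we find that:

$$|AP_{\delta+1}| \ge \frac{|N_0(AP_\delta \cap LARGE)|}{2} - |(AP_\delta \cap LARGE)| - \frac{|\{y \in N_0(AP_\delta \cap LARGE) : (y, x_{\delta+1}(x_\delta, y)) \notin T_{\kappa-1}\}|}{2}$$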

Let $Y_d = \{v : v \in N_0(AP_d \cap LARGE)$ and vertex $v$ was seen as $y_d$ during iteration $\kappa\}$. Note that $Y_d$ is in general a proper subset of $N_0(AP_d \cap LARGE)$.

We also have that:

$$|\{y \in N_0(AP_\delta \cap LARGE) : (y, x_{\delta+1}(y)) \notin T_{\kappa-1}\}|$$

$$\le |\{v : v \in AP_d, \text{ or } v \in Y_d,\ 0 \le d \le \delta\}|$$

$$\le n e^{-2c/3} + \Big|\Big\{v : v \in \bigcup_{0 \le d \le \delta} [(AP_d \cap LARGE) \cup (Y_d \cap LARGE)]\Big\}\Big|$$

The first inequality holds since the left hand side is certainly less than the total number of vertices incident to all edges added during iteration $\kappa$. (Edge $(y_\delta, x_{\delta+1})$ must have been added to some AES during this iteration.) The second inequality holds since we have at most $n e^{-2c/3}$ small vertices.

Next notice that any large vertex in $AP_d$ will also be in $AP_{d+2}$. Notice that $y_d$ will be $\in Y_{d'}$, for all $d' \ge d$. Thus we can complete our inequality with:

$$|\{y \in N_0(AP_\delta \cap LARGE) : (y, x_{\delta+1}(y)) \notin T_{\kappa-1}\}|$$

$$\le n e^{-2c/3} + |Y_{\delta-1}| + |AP_{\delta-1} \cap LARGE| + |AP_{\delta-2} \cap LARGE|$$

$$\le n e^{-2c/3} + 2|AP_\delta| + |AP_{\delta-1}| + |AP_{\delta-2}|$$

The final inequality holds since there are at most 2 $y_d$ such that $(y_d, x_{d+1}) \in T_{\kappa-1}$.


Plugging  in  all  of  this  lets  us  see  that: 

\APs+l\  >  \N«(APs  n  LARGE)\  ne'2C 

2  2 

JAP^d  _ ^  n  LARGE\ 


^  -  mai  -  ^ 


>  ^(MAI  -  n«-^3)  -  -  \AP,\  - 

>  ~\AP,\  -  -~-rie-2^3  - 

2  2  2  2 


l^-a| 

2 


These final steps use Lemma 5.1. We finish the two parts of the lemma by using induction. Assume $|AP_0| \ge \ell n e^{-2c/3}$.


MAI  >  ^MAol  - 

MAI  >  ^MAI  -  ^ 

=  ^MAI  +  jl^MAI-MAI 
>  ^MAI  +  in«-^ 


l)ne_2c/3] 


Assume true for $|AP_\delta| \le n/2\ell$; show true for $|AP_{\delta+1}|$.

MA+i|  >  ^Mftl  - 

>  ^Mftl  +  jl^MA-.l  -  MA-.I  -  MA-il 


-  {1  -  l)ne“2c/3] 
>  l~Y-\AP,\  +  ine-^3  +  illAft-jj  -  fr-e'31'3]


Assume BTS fails during iteration $\kappa$ on the graph $G$. Step two of our analysis is to find an expression for the number of 1-edges BTS finds that would create a pap if any one of the 1-edges were changed to a 0-edge. This is the set of necessary 1-edges which we defined earlier.

We have come to the part of this analysis where we make use of the fact that BTS restarts itself at depth $\lfloor \delta_{max}/2 \rfloor$. From Lemma 5.4, we know that the largest sized $AP_\delta$ we can guarantee is $\left(\frac{\ell-5}{2}\right)\left(\frac{n}{2\ell}\right)$, for any $\delta$. Earlier we saw that most edges having one vertex in $AP_\delta$ and one in $AP_0$ were necessary. This implies we can guarantee almost $\left[\left(\frac{\ell-5}{2}\right)\left(\frac{n}{2\ell}\right)\right]\left[\ell n e^{-2c/3}\right]$ necessary edges (we may have to have $\delta$ very large). However, this turns out to be not nearly enough necessary edges to give us good bounds for guaranteed tour values that BTS will find. This is especially true if $c$ is large.

However, if we restart the searches after $\lfloor \delta_{max}/2 \rfloor$, then we may use all vertices in $AP_{\lfloor \delta_{max}/2 \rfloor}$ to finish paps. This is true since every vertex in this set has 0-degree < 2 in some $AES_{\lfloor \delta_{max}/2 \rfloor}$. This AES will be used as a Base Set. Thus we have two groups of vertices, $AP_{\lfloor \delta_{max}/2 \rfloor}$ and $AP_{\delta_{max}}$, which we may use to guarantee necessary edges. If BTS fails then it will see $\approx \frac{1}{2}\left[\left(\frac{\ell-5}{2}\right)\left(\frac{n}{2\ell}\right)\right]^2$ necessary 1-edges. The formal statement and proof follow.

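Lemma 5.5 Given the conditions of Lemma 5.4, the following holds a.a.:

$$|\Psi_2| \ge \min\left[\left(\frac{\ell-5}{2}\right)|AP_0|^2\,\frac{1}{2},\ \left(\frac{\ell-5}{2}\right)\left(\frac{n}{2\ell}\right)(|AP_0| - 2)\,\frac{1}{2}\right]$$

$$|\Psi_{\delta_{max}}| \ge \min\left[\left(\frac{\ell-5}{2}\right)^{\delta_{max}-1}|AP_0|^2\,\frac{1}{2},\ \left(\frac{\ell-5}{2}\right)^2\left(\frac{n}{2\ell}\right)^2\frac{1}{2}\right]$$

for $\delta_{max}$ greater than 2.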


Proof: 

The general idea of this proof is to show that even if BTS fails, it will still find many pairs of aps. Each ap pair implies an edge which could hook them together, and in most cases this edge is necessary. Thus we need only use Lemma 5.4 to bound the number of ap pairs that BTS will find. (Equivalently, we find vertices in $AP_{\lfloor \delta_{max}/2 \rfloor}$ and $AP_{\delta_{max}}$, and then show which of these vertex pairs form a new Base Set when they are joined together by a 0-edge.) Thus, we are counting necessary edges between pairs of aps.
Figure 5: Graph of $AES_1$


We will rename the vertices BTS sees to emphasize the fact that it will find two aps and then try to hook them together. Denote the vertices on the two aps by $x_0, y_0, x_1, y_1, x_2, y_2, x_3, y_3, \ldots, x_{\lfloor \delta_{max}/2 \rfloor}$ (first ap) and $\hat{x}_0, \hat{y}_0, \ldots, \hat{x}_1, \hat{y}_1, \ldots$ (second ap). $\hat{x}_0$ is the first vertex on the second ap, and is chosen by BTS at depth $\lfloor \delta_{max}/2 \rfloor$.

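We first prove the bound for $\delta_{max} = 2$. The first step in this proof is to bound the number of endpoints of aps of length one. By Lemma 5.4 we have:

$$|AP_1| \ge \min\left[|AP_0|\left(\frac{\ell-5}{2}\right) + \ell n e^{-2c/3},\ \left(\frac{\ell-5}{2}\right)\left(\frac{n}{2\ell}\right)\right]$$

Next look at any $x_1 \in AP_1$ and its corresponding $AES_1$. (See Figure 5.) If BTS fails at an iteration with $\delta_{max} = 2$, then $x_1$ is adjacent to at least $|AP_0| - 2$ necessary 1-edges. The only 0-edges adjacent to a vertex in $AP_0$ which may not be necessary are $(x_0, x_1(x_0, y_0))$ and $(x_0, x_1)$, where the latter edge may form a cycle in $AES_1 \cup (x_0, x_1)$ (as in Figure 5).

Thus each $x_1$ vertex implies $(|AP_0| - 2)$ necessary 1-edges, and this gives a bound for the total number of necessary 1-edges BTS will find (a.a.). This bound is:

$$|\Psi_2| \ge \min\left[\left(\left(\frac{\ell-5}{2}\right)|AP_0| + \ell n e^{-2c/3}\right)(|AP_0| - 2)\,\frac{1}{2},\ \left(\frac{\ell-5}{2}\right)\left(\frac{n}{2\ell}\right)(|AP_0| - 2)\,\frac{1}{2}\right]$$

$$\ge \min\left[\left(\frac{\ell-5}{2}\right)|AP_0|^2\,\frac{1}{2},\ \left(\frac{\ell-5}{2}\right)\left(\frac{n}{2\ell}\right)(|AP_0| - 2)\,\frac{1}{2}\right]$$

(The 1/2 in the expressions accounts for counting paps up to twice, once in either direction.)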

Next, we prove the general case for $\delta_{max}$ odd and $\ge 3$. Recall, we assume BTS fails during iteration $\kappa$. To prove the bound for the number of necessary 1-edges BTS will find (a.a.), we look for necessary edges between the two possible aps.

As before, if we could find a lower bound for the guaranteed number of these ap pairs, and consequently a lower bound for the number of edges used to hook these pairs together, we could find a lower bound for the number of necessary 1-edges BTS will find. (The necessary 1-edges are the edges which can be used to hook together the two aps.)


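Figure 6: Graphs of $AES_2$ (Figure A) and 2 possible $AES_4$ (Figures B and C), where $\delta_{max} = 5$.

We proceed as in the case where $\delta_{max} = 2$. First we bound the number of endpoints of aps of length $\lfloor \delta_{max}/2 \rfloor$. This is equivalent to bounding the size of $|AP_{\lfloor \delta_{max}/2 \rfloor}|$. Again, by Lemma 5.4 this is as follows:

$$|AP_{\lfloor \delta_{max}/2 \rfloor}| \ge \min\left[\left(\frac{\ell-5}{2}\right)^{\lfloor \delta_{max}/2 \rfloor}|AP_0| + \ell n e^{-2c/3},\ \left(\frac{\ell-5}{2}\right)\left(\frac{n}{2\ell}\right)\right]$$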


We next look at any vertex denoted by an $x_{\lfloor \delta_{max}/2 \rfloor}$ and its corresponding $AES_{\lfloor \delta_{max}/2 \rfloor}$. (More than one AES may exist: choose one.) Using $AES_{\lfloor \delta_{max}/2 \rfloor}$ as a Base Set, we want to bound the number of endpoints which imply necessary edges in $AES_{\delta_{max}-1}$ sets. We will do this by counting the endpoints of only certain aps. (We continue to use the new notation for ap vertices defined at the start of this proof.)

The following characterizes the second aps that we will count. Given a fixed ap of length $(\delta_{max} - 1)/2$ (since $\delta_{max}$ is odd, $\lfloor \delta_{max}/2 \rfloor = (\delta_{max} - 1)/2$) and its acceptable edge set $AES_{(\delta_{max}-1)/2}$, any ap found from this Base Set will imply a necessary edge of the form

if the following are satisfied:

1. $\hat{x}_0 \ne x_0$

2. No $\hat{x}_d$ is in the same component with $x_{(\delta_{max}-1)/2}$

It is easy to show that these conditions are sufficient. Condition 1 is needed since $x_0$ may have 0-degree 2 and thus may not be used to start a new ap. If the second condition is satisfied, no cycles will be created if the necessary edge is added as a 0-edge. (See Figure 5.)
Figure 7: Adding edges in second ap, $\delta_{max} = 5$, $x_{(\delta_{max}-1)/2} = x_2$. (Panels A, B and C; annotation in the figure: "choose this vertex".)

Given any $x_{\lfloor \delta_{max}/2 \rfloor}$ and a corresponding $AES_{\lfloor \delta_{max}/2 \rfloor}$, recall that in the proof of Lemma 5.4 we used the following idea: first, for any edge added (here we add $(x_d, y_d)$) during an iteration of BTS, we take into account the removal of only one of two possible edges. The edge we choose to account for is called $(y_d, x_{d+1})$ and we refer to the vertex $x_{d+1}$ as $x_{d+1}(x_d, y_d)$. Though we may have two choices for the vertex $x_{d+1}$ (neither implies cycles), we choose only one to count. In this analysis, we stipulate that if $y_d$ is in the same component as the fixed vertex $x_{(\delta_{max}-1)/2}$ (in the current AES), we choose $x_{d+1}$ so that it will not be in the same component as $x_{(\delta_{max}-1)/2}$ in the next AES. (See Figure 6A; our fixed vertex is $x_2 = x_{(\delta_{max}-1)/2}$. Figure 6B shows the correct choice for $x_2$, Figure 6C the incorrect choice for $x_2$.)

As long as $\hat{x}_0$ is not initially in the same component with $x_{(\delta_{max}-1)/2}$, we can always make this choice. No cycles will ever be formed. The proof and bounds will still be valid. See Figure 7 for the three possible results of adding the next edge to the second ap. 7A and 7B illustrate how to choose the next vertex. 7C forms a pap, which cannot happen since BTS fails during this iteration.

We are basically done. We need just note that for any fixed ap and its corresponding AES, we can start the second ap from $(|AP_0| - 2)$ vertices. (This follows the same reasoning as in the case for $\delta_{max} = 2$.) We may not be able to start the second ap from $x_0$ or from a vertex in the same component with $x_{\lfloor \delta_{max}/2 \rfloor}$. (See Figure 5.) As in the bound for the number of endpoints of the first aps, we use Lemma 5.4. The number of vertices BTS will find that we have called $\hat{x}_{(\delta_{max}-1)/2}$ is no less than:

$$\min\left[\left(\frac{\ell-5}{2}\right)^{(\delta_{max}-1)/2}(|AP_0| - 2),\ \left(\frac{\ell-5}{2}\right)\left(\frac{n}{2\ell}\right)\right]$$

We need now only multiply the two bounds together, and divide by two. This will account for counting the edges at most twice, as the endpoint pairs can be used in both aps. Note that $|AP_1|(|AP_0| - 2) \ge \left(\frac{\ell-5}{2}\right)|AP_0|^2$. We have also accounted for this in the final expression of this lemma. This concludes the proof for $\delta_{max}$ odd.


The expression is proved for $\delta_{max}$ even and $\ge 4$ in exactly the same way. The only difference is that the second ap is one edge shorter than the first.

□ 


We now start the third step of our analysis, which is to find an upper bound for the number of 0-edges seen in the Base Sets $T_k$, $0 \le k \le \kappa - 1$.

Now, instead of thinking about BTS in terms of iterations, it will behoove us to think about sets of iterations that use the same $\delta_{max}$. We will do this by induction on $\delta_{max}$. Later, we will analyze BTS by showing how long BTS can continue (a.a.) until it needs to increase $\delta_{max}$.

Definition: $B(G,\kappa) = \{$1-edges in initial tour $\cup\ (\bigcup_{k=0}^{\kappa-1} T_k)\}$. $B(G,\kappa)$ is the union of the initial tour and all of the Base Sets seen through iteration $\kappa - 1$.

We give a lemma that enables us to bound $|B(G,\kappa)|$.

Recall that $tv(\delta,n)$ is the value of the tour found when BTS can search to a depth of $\delta$. Also, BTS will first look for paps of length 1, then those of length $\le 2$, etc. Thus you may also think of $tv(\delta,n)$ as being the tour value where BTS must increase its depth from $\delta$ to $\delta + 1$ in order to continue its search for a lower valued tour.

Lemma 5.6 Given BTS is at iteration $\kappa$ and is using $\delta_{max}$ on graph $G$, then the following holds:

$$|B(G,\kappa)| \le 2n + \sum_{\delta=1}^{\delta_{max}-1} tv(\delta,n)$$


Proof: 

BTS runs at $\delta = 1$ until it cannot add any more paps of length one. The underlying tour has value $tv(1,n)$. It then runs using $\delta \le 2$, until it stops (with underlying tour of value $tv(2,n)$). BTS keeps increasing depth until it reaches $\delta_{max}$.

The initial tour uses $n$ edges. How many additional edges are seen in Base Sets when BTS searches for paps of length one? There will be at most

$$(n - tv(1,n)) \le n$$

Thus if $\delta_{max} = 1$, we have

$$|B(G,\kappa)| \le n + (n - tv(1,n)) \le 2n$$

Increasing $\delta_{max}$ to 2 increases the number of edges seen in Base Sets by at most $2(tv(1,n) - tv(2,n))$. Thus we have:

$$|B(G,\kappa)| \le n + (n - tv(1,n)) + 2(tv(1,n) - tv(2,n))$$

$$\le 2n + tv(1,n) - 2tv(2,n)$$

$$\le 2n + tv(1,n)$$

Note that the final bound holds for any iteration which uses $\delta_{max} = 2$. You can see this easily generalizes to any $\delta_{max}$. We have:

$$|B(G,\kappa)| \le n + (n - tv(1,n)) + 2(tv(1,n) - tv(2,n)) + \ldots + \delta_{max}(tv(\delta_{max} - 1,n) - tv(\delta_{max},n))$$

$$\le 2n + tv(1,n) + tv(2,n) + \ldots + tv(\delta_{max} - 1,n) - \delta_{max}\,tv(\delta_{max},n)$$

$$\le 2n + tv(1,n) + \ldots + tv(\delta_{max} - 1,n)$$


□ 
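
The telescoping step in the proof above can be checked numerically. The following sketch is illustrative only: the function name and the sample tour values are hypothetical, and `tv[0]` is set to `n` so that the depth-one term reads `n - tv(1,n)` as in the proof.

```python
def base_set_edge_bound(n, tv):
    """Compare the telescoping sum from the proof of Lemma 5.6 with its simplified bound.

    tv[d] is the tour value reached at depth d for d = 1..K; tv[0] must equal n.
    Returns (telescoped, simplified), where
      telescoped = n + sum_{d=1..K} d * (tv[d-1] - tv[d])
      simplified = 2n + tv[1] + ... + tv[K-1]
    """
    K = len(tv) - 1
    telescoped = n + sum(d * (tv[d - 1] - tv[d]) for d in range(1, K + 1))
    simplified = 2 * n + sum(tv[1:K])
    return telescoped, simplified

# Hypothetical values: n = 1000 and tour values 65, 48, 28 at depths 1, 2, 3.
telescoped, simplified = base_set_edge_bound(1000, [1000, 65, 48, 28])
print(telescoped, simplified)
```

The two quantities differ by exactly $\delta_{max}\,tv(\delta_{max},n)$, which is the nonnegative term dropped in the last line of the proof.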


We now come to the fourth step in our analysis. We need to come up with an expression for the number of edges that do not influence BTS until the iteration $\kappa$ during which BTS fails. (Notice that since we assume that BTS terminates during iteration $\kappa$, BTS returns a tour value of $n - |T_{\kappa-1}|$ and finds a tour containing the 0-edges in $T_{\kappa-1}$.)

Before we can go any further, we need to define our set of noninfluential edges.

Definition: $X \subseteq E_0$ is called deletable if:

(1) No edge of $X$ is incident with a vertex of 0-degree $< c/10$.

(2) The edges in $X$ form a matching in $G_{n,c/n}$.

(3) $X \cap B(G,\kappa) = \emptyset$.
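
The three conditions of the definition translate directly into a membership test. The sketch below is only an illustration: the function name, the dictionary of 0-degrees and the edge representation are assumptions made for the example and do not come from the paper.

```python
from collections import Counter

def is_deletable(X, zero_degree, c, base_edges):
    """Check the three conditions of the deletable-set definition.

    X           : iterable of 0-edges, each a pair (u, v)
    zero_degree : dict mapping each vertex to its 0-degree
    c           : the constant c of G_{n,c/n}
    base_edges  : edges of B(G, kappa), each a pair (u, v)
    """
    edges = [frozenset(e) for e in X]          # undirected edges
    seen = {frozenset(e) for e in base_edges}
    # (1) no edge of X touches a vertex of 0-degree < c/10
    if any(zero_degree[v] < c / 10 for e in edges for v in e):
        return False
    # (2) the edges of X form a matching: no vertex is used twice
    uses = Counter(v for e in edges for v in e)
    if any(k > 1 for k in uses.values()):
        return False
    # (3) X is disjoint from B(G, kappa)
    return not any(e in seen for e in edges)

degrees = {1: 5, 2: 5, 3: 5, 4: 5, 5: 1}
print(is_deletable([(1, 2), (3, 4)], degrees, 20, [(2, 3)]))
```

In the example call, no vertex of $X$ is small, the two edges share no endpoint, and neither edge lies in the base-set union, so the set is deletable.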

The deletable set idea is used in [2] and [4]; we have altered it for our purposes. It should be clear that the third property of $X$ keeps it from influencing BTS before iteration $\kappa$. As we saw earlier, this does not mean that BTS will run exactly the same when edges in $X$ become 1-edges. BTS may not see all previous aps now. However, BTS will see exactly the same Base Sets as before.

Lemma 5.7 If $|B(G,\kappa)| \le nc/30$ and $c \ge 95$, then there a.a. exists a deletable set of size
where $\gamma$ is any function that is $O(\log n)$.

We prove this in 4 steps.

1. $B(G,\kappa)$ can contain all edges incident to no more than $20(|B(G,\kappa)| - n)/(c - 10)$ large vertices.

2. $X$ has at least $n - n e^{-2c/3} - 20(|B(G,\kappa)| - n)/(c - 10) = a$ large vertices left to choose edges from.

3. If $|B(G,\kappa)| \le nc/30$, the number of edges left to choose from is $\ge a(a - 1)c/10n$.

4. Find a matching from vertices in 2, using Lemma 5.1 and Lemma 5.3.
5.4 Putting it all together


Next we use all of the previously found information to analyze BTS. First a quick preview: in order to prove that BTS can find a tour of a certain value (a.a.), we prove it does not fail during iterations corresponding to larger tour values (a.a.). The proofs will center upon the deletable set idea. We will generate a set $X$ and let $G_X$ be the graph $G$, except that edges in $X$ have been changed to 1-edges.

The general idea is as follows: suppose BTS fails during iteration $\kappa$, and $X$ is deletable. It follows that BTS fails on $G_X$. Furthermore, BTS will see the same Base Sets ($T_k$, $0 \le k \le \kappa - 1$) on $G$ and $G_X$. This is true because BTS is a deterministic algorithm. Given the same realization and starting tour, BTS will always perform identically. We will use Lemma 5.5 to guarantee that if we run BTS on $G_X$ the algorithm will see many necessary 1-edges in $G_X$. Under certain conditions, the probability that none of these necessary 1-edges are in $X$ will be so small that $P$(BTS fails during iteration $\kappa$) $\to 0$ as $n \to \infty$. Of course we know by Lemma 5.2 that BTS must fail by a certain iteration. However, we want to know how small a tour value BTS can find, as a function of $\delta_{max}$.

Important Note: The reader may have noticed that we have been sneaky. The reader may think, if we want to use those lemmas, that we need to go back and change them to hold true for $G_X$, instead of $G$. In fact, the lemmas that may be affected (5.4 and 5.5) are true for $G_X$ as well as $G$. Since we wanted to do the calculations only once, we did them for $G_X$. How can we see this? Notice that we defined "small" vertices as having $< c/10$ incident 0-edges, but we used a bound for vertices of 0-degree $\le c/10 + 1$ (see Lemma 5.1). So when we subtracted off the small vertices, we included vertices that may have become small due to changing the weights of edges in $X$.

After generating our edge weights on $G_{n,c/n}$, we randomly and independently color the edges of $E_0$ green with probability $\gamma \log n/cn$. Call the green edges $E_{0g}$. We will want to see if these edges can be a deletable set. Note that

$$\mathrm{Exp}[|E_{0g}|] = \binom{n}{2}\,\frac{c}{n}\,\frac{\gamma \log n}{cn} \approx (\gamma \log n)/2$$

The following method is analogous to that in Frieze [4]. First, we must restrict our Bernoulli distribution. We let $c \ge 20(\ell + 1)\log(\ell + 1)$, where $\ell \ge 9$. Let it be given that BTS has reached iteration $\kappa$.

We need the following conditions:

1. $|\{v \in V : d_0(v) \le c/10 + 1\}| \le n e^{-2c/3}$

2. $d_0(v) \le 4 \log n$ for all $v \in V$
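3. $|\Psi_2| \ge \min\left[\left(\frac{\ell-5}{2}\right)|AP_0|^2\,\frac{1}{2},\ \left(\frac{\ell-5}{2}\right)\left(\frac{n}{2\ell}\right)(|AP_0| - 2)\,\frac{1}{2}\right]$

$|\Psi_{\delta_{max}}| \ge \min\left[\left(\frac{\ell-5}{2}\right)^{\delta_{max}-1}|AP_0|^2\,\frac{1}{2},\ \left(\frac{\ell-5}{2}\right)^2\left(\frac{n}{2\ell}\right)^2\frac{1}{2}\right]$

for $\delta_{max}$ greater than 2

4. $|B(G,\kappa)| \le 2n + \sum_{\delta=1}^{\delta_{max}-1} tv(\delta,n)$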

5.  If  \B(G,k)\  <  nc/ 30,  then  there  a. a.  exists  a  deletable  set  of  size  — , 
where  7  =  O(logn). 

Let  L  be  the  event  that  conditions  1-5  hold.  Note  that  P(L)  =  1  —  o(l). 
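Define the following events:

$\mathcal{E}_1 = [(\text{BTS fails during iteration } \kappa \text{ on } G) \cap L]$

$\mathcal{E}_2 = [\mathcal{E}_1 \cap (X = E_{0g} \text{ is deletable})]$

We will investigate

$$P(\mathcal{E}_1) = \frac{P(\mathcal{E}_2)}{P(\mathcal{E}_2 \mid \mathcal{E}_1)}$$

We first investigate $P(\mathcal{E}_2 \mid \mathcal{E}_1)$. We will use $|\overline{B}(G,\kappa)|$, which by definition is an upper bound for $|B(G,\kappa)|$ which holds a.a.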

Lemma 5.8

$$P(\mathcal{E}_2 \mid \mathcal{E}_1) \ge (1 - o(1))\left(1 - \frac{\gamma \log n}{cn}\right)^{|\overline{B}(G,\kappa)| + (c/10) n e^{-2c/3}}$$

Proof: If BTS fails on $G$, $X$ need only satisfy its 3 requirements up to iteration $\kappa$. We have that:

$$P(\mathcal{E}_2 \mid \mathcal{E}_1) = P(X \cap B(G,\kappa) = \emptyset,\ X \cap E_0(SMALL) = \emptyset,\ X \text{ forms a matching})$$

$$= P(X \text{ forms a matching} \mid X \cap (B(G,\kappa) \cup E_0(SMALL)) = \emptyset) \cdot P(X \cap (B(G,\kappa) \cup E_0(SMALL)) = \emptyset)$$

We'll show:

(1) $P(X \text{ forms a matching} \mid X \cap (B(G,\kappa) \cup E_0(SMALL)) = \emptyset) = (1 - o(1))$

(2) $P(X \cap (B(G,\kappa) \cup E_0(SMALL)) = \emptyset) \ge \left(1 - \frac{\gamma \log n}{cn}\right)^{|\overline{B}(G,\kappa)| + (c/10) n e^{-2c/3}}$

to prove the Lemma.

To show (1), let $a$ = [number of vertices with at least 2 incident edges in $X$]. Next use the Markov Inequality and show that $\mathrm{Exp}(a) = o(1/n)$.

Item (2) is easily derived by noticing that there are $\le (c/10) n e^{-2c/3}$ 0-edges (attached to small nodes) that may not be colored green. By definition $|\overline{B}(G,\kappa)|$ is an upper bound for $|B(G,\kappa)|$. Later we will find a valid value for $|\overline{B}(G,\kappa)|$.

(Full proofs may be found in the appendix.) □


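Now we will investigate $\mathcal{E}_2$.

Lemma 5.9

$$P(\mathcal{E}_2) \le \left(1 - \frac{(\gamma \log n)(c)}{(cn)(n)}\right)^{|\underline{\Psi}_{\delta_{max}}|}$$

To prove, follow these simple steps:

$\mathcal{E}_2 = [\mathcal{E}_1 \cap X = E_{0g} \text{ is deletable}]$

$\mathcal{E}_2 \Rightarrow$ [BTS fails on $G_X$ during iteration $\kappa$ AND $X \cap \Psi_{\delta_{max}}(G_X) = \emptyset$ AND $L$ occurs]

thus

$\mathcal{E}_2 \subseteq$ [BTS fails on $G_X$ during iteration $\kappa\ \cap\ (X \cap \Psi_{\delta_{max}}(G_X) = \emptyset)\ \cap\ L$]

and clearly

$$P(\mathcal{E}_2) \le P(X \cap \Psi_{\delta_{max}}(G_X) = \emptyset)$$

Now, recall that the edges $e \in X$ were colored independently. By definition $|\underline{\Psi}_{\delta_{max}}|$ is a lower bound for $|\Psi_{\delta_{max}}|$. Thus we have:

$$P(\mathcal{E}_2) \le [P(e^* \notin X)]^{|\underline{\Psi}_{\delta_{max}}|}$$

where $e^* \in \Psi_{\delta_{max}}$. Hence,

$$P(\mathcal{E}_2) \le \left(1 - \frac{\gamma \log n}{n^2}\right)^{|\underline{\Psi}_{\delta_{max}}|}$$

□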

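Putting the two Lemmas together:

Theorem 5.1 Given a random graph $G_{n,c/n}$. The following holds a.a.:

$P$(BTS fails during iteration $\kappa$ | BTS succeeds up to iteration $\kappa - 1$) =

$$P(\mathcal{E}_1) = \frac{P(\mathcal{E}_2)}{P(\mathcal{E}_2 \mid \mathcal{E}_1)} \le \frac{\left(1 - \frac{\gamma \log n}{n^2}\right)^{|\underline{\Psi}_{\delta_{max}}|}}{(1 - o(1))\left(1 - \frac{\gamma \log n}{cn}\right)^{|\overline{B}(G,\kappa)| + (c/10) n e^{-2c/3}}}$$

where $\gamma$ is any function that is $O(\log n)$, and we must have $|\overline{B}(G,\kappa)| \le nc/30$.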


Next we look for relationships between $|\overline{B}(G,\kappa)|$ and $|\underline{\Psi}_{\delta_{max}}|$. Given BTS is at iteration $\kappa$, we want to find the smallest $\delta_{max}$ we can use and still have the probability BTS fails go to 0. By keeping $\delta_{max}$ as small as possible, $|\overline{B}(G,\kappa)|$ is smaller and we can proceed to do more iterations before $|\overline{B}(G,\kappa)| > nc/30$.

Please keep in mind that the probability

$P(\mathcal{E}_1) = P((\text{BTS fails during iteration } \kappa \text{ on } G) \cap L)$

is conditioned on BTS running to iteration $\kappa$. However, we would like to have an unconditioned bound for this probability. We may use conditional probability to see that:

$P$(BTS fails during iteration $\kappa$)

$= P$(BTS fails during iteration $\kappa$ | BTS gets to iteration $\kappa$) $\cdot\ P$(BTS gets to iteration $\kappa$)

Since BTS can have up to $n$ iterations, we need $P(\mathcal{E}_1) \le o(1/n)$. It will be sufficient to show that (a.a.)

$$e^{-\frac{\gamma \log n}{n^2}|\underline{\Psi}_{\delta_{max}}| + \frac{\gamma \log n}{cn}\left(|\overline{B}(G,\kappa)| + (c/10) n e^{-2c/3}\right)} = o(1/n)$$
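Why? Let

$$A = \left(1 - \frac{\gamma \log n}{n^2}\right)^{|\underline{\Psi}_{\delta_{max}}|}, \qquad B = e^{-\frac{\gamma \log n}{n^2}|\underline{\Psi}_{\delta_{max}}|}$$

$$C = \left(1 - \frac{\gamma \log n}{cn}\right)^{|\overline{B}(G,\kappa)| + (c/10) n e^{-2c/3}}, \qquad D = e^{-\frac{\gamma \log n}{cn}\left(|\overline{B}(G,\kappa)| + (c/10) n e^{-2c/3}\right)}$$

$$P(\mathcal{E}_1) \le (1 + o(1))\frac{A}{C} = (1 + o(1))\,\frac{A}{B}\,\frac{B}{D}\,\frac{D}{C}$$

Now $A/B \le 1$ for all $n$. We'll show that $\lim_{n \to \infty} \frac{D}{C} = 1$. Let $a_n = (|\overline{B}(G,\kappa)| + (c/10) n e^{-2c/3})$. We will need the following property: if $t \le .43$, then $-\log(1 - t) \le (t + t^2/2 + t^3/2)$.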
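
The logarithm inequality used here can be spot-checked numerically. The sketch below is only a sanity check on a grid of values, not a proof; the function name is hypothetical, and the natural logarithm is assumed.

```python
import math

def log_bound_holds(t):
    """Return True if -log(1 - t) <= t + t^2/2 + t^3/2 (natural logarithm)."""
    return -math.log(1.0 - t) <= t + t**2 / 2 + t**3 / 2

# The inequality holds on a grid over [0, 0.43] and fails slightly above that range.
print(all(log_bound_holds(k / 1000) for k in range(0, 431)))
print(log_bound_holds(0.44))
```

The check is consistent with the threshold .43 quoted in the text: the series $-\log(1-t) = t + t^2/2 + t^3/3 + t^4/4 + \ldots$ stays below $t + t^2/2 + t^3/2$ only while the tail beyond the cubic term is at most $t^3/6$.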


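$$\lim_{n \to \infty} \frac{D}{C} = e^{\lim_{n \to \infty}(\log D - \log C)}$$

$$= e^{\lim_{n \to \infty}\left[-\frac{\gamma a_n}{cn}\log n - a_n \log\left(1 - \frac{\gamma \log n}{cn}\right)\right]}$$

$$\le e^{\lim_{n \to \infty}\left[-\frac{\gamma a_n}{cn}\log n + a_n\left(t + \frac{t^2}{2} + \frac{t^3}{2}\right)\right]}, \qquad t = \frac{\gamma \log n}{cn}$$

$$= e^{\lim_{n \to \infty} a_n\left(\frac{t^2}{2} + \frac{t^3}{2}\right)}$$

$$= e^0$$

$$= 1$$

So now we know that if (a.a.)

$$\frac{B}{D} = o(1/n)$$

BTS will find any tour values this equation implies. This is equivalent to showing that

$$-\frac{\gamma}{n^2}|\underline{\Psi}_{\delta_{max}}| + \frac{\gamma}{cn}\left(|\overline{B}(G,\kappa)| + (c/10) n e^{-2c/3}\right) < -1$$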
We can see from Lemma 5.4 that $\gamma$ may be any function that is $O(\log n)$. From now on, we assume it is in fact $\Theta(\log n)$.

Before we get to the next theorem, here is a corollary to demonstrate the implications of the theorem. Recall that $tv(\delta_{max},n)$ represents the value of the smallest tour BTS finds using $\delta_{max}$. Let $TV(\delta_{max},n) = \sum_{\delta=0}^{\delta_{max}} tv(\delta,n)$, where we define $tv(0,n) := (2 + \epsilon_c)n$.

Corollary 5.1 Given a random graph $G_{n,c/n}$ and $c \ge 461$ (as $c \ge 20(\ell + 1)\log(\ell + 1)$ and $\ell \ge 9$). If

$$|B(G,\kappa + 1)| \le \frac{nc}{30} \quad \text{and} \quad TV(\delta_{max},n) \le \frac{cn}{2}\left(\frac{\ell-5}{4\ell}\right)^2$$

then the following holds (a.a.):

$$tv(\delta + 1,n) \le \rho_\delta\, tv(\delta,n)$$

and $\rho_\delta$ converges to $\left(\frac{2}{\ell-5}\right)^{1/2} + \epsilon$, where $\epsilon$ is $O(e^{-c/3})$.
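
The constants in the corollary can be reproduced with a few lines of arithmetic. This is a minimal sketch under stated assumptions: logarithms are natural, the function names are hypothetical, and the cap on $TV(\delta_{max},n)/n$ is read as $(c/2)((\ell-5)/4\ell)^2$, a reading that matches the values $2.845n$ and $5.55n$ quoted later in the text.

```python
import math

def min_c(ell):
    """Smallest integer c with c >= 20 (ell + 1) log(ell + 1), natural log assumed."""
    return math.ceil(20 * (ell + 1) * math.log(ell + 1))

def limiting_rate(ell):
    """Limit (2 / (ell - 5))^(1/2) of the ratio tv(delta+1, n) / tv(delta, n)."""
    return math.sqrt(2.0 / (ell - 5))

def tv_sum_cap(c, ell):
    """Assumed cap on TV(delta_max, n) / n, read as (c/2) ((ell - 5) / (4 ell))^2."""
    return (c / 2.0) * ((ell - 5) / (4.0 * ell)) ** 2

for ell in (9, 11):
    c = min_c(ell)
    print(ell, c, round(limiting_rate(ell), 5), round(tv_sum_cap(c, ell), 3))
```

For $\ell = 9$ this gives the threshold $c \ge 461$ stated in the corollary, and for $\ell = 11$ it gives $c \ge 597$ with limiting rate $1/\sqrt{3}$.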

The restrictions on $|B(G,\kappa + 1)|$ and $tv(\delta,n)$ will be justified in the proof of the following theorem. As we increase $\delta_{max}$, our guaranteed tour values decrease geometrically. The general theorem is a bit more messy, but you should be able to see the pattern. The corollary is proved after the proof of the theorem.

We have already seen from Lemma 5.2 that the optimal tour value, $tv^*$, can be bounded as follows:

$$tv^* \ge n(1 - c/n)^{n-1} + \frac{(n - 1)c}{2}(1 - c/n)^{n-2} - n^{1/2}\log n$$

See page 32 for a discussion of the relationship between this lower bound and the upper bounds given in the previous theorem.

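Theorem 5.2 Given a random graph $G_{n,c/n}$. The following hold a.a.:

(1) If $\delta_{max} = 1$ and $c \ge 10$

$$tv(1,n) \le \overline{tv}(1,n) = \frac{n}{j}, \quad \text{where } j + j\log j \le \frac{c}{10}$$

(2) Let $\ell \ge 9$ be fixed, where $(\ell + 1)\log(\ell + 1) \le c/20$. (Thus $c \ge 461$.) Then

$$tv(2,n) \le \overline{tv}(2,n) = \left[\frac{2\,TV(1,n)\,n}{c}\left(\frac{2}{\ell-5}\right)\right]^{1/2}$$

$$tv(\delta_{max},n) \le \overline{tv}(\delta_{max},n) = \left[\frac{2\,TV(\delta_{max} - 1,n)\,n}{c}\left(\frac{2}{\ell-5}\right)^{\delta_{max}-1}\right]^{1/2}$$

for $\delta_{max} \ge 3$ and as long as $tv(\delta_{max},n) \ge \ell n e^{-2c/3}$, and $TV(\delta_{max},n) \le \frac{cn}{2}\left(\frac{\ell-5}{4\ell}\right)^2$.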

Proof: 

We prove the case for $\delta_{max} = 1$ first. Suppose that BTS is starting iteration $\kappa$, and $[n - |T_{\kappa-1}|] \ge n/j$. The number of vertices with 0-degree < 2 in $T_{\kappa-1}$ is $|AP_0| \ge [n - |T_{\kappa-1}| + 1]$.

By  Lemma  5.3,  |E„(AP„)|  >  £ip||£„(/lP„)|]/5  =  >  -a,. 

The number of 0-edges in $[E_0(AP_0) \cap T_{\kappa-1}] = |AP_0| - (n - |T_{\kappa-1}|) - g$, where $g$ = the number of components of size $\ge 3$ vertices in $T_{\kappa-1}$. ($g$ = number of 0-edges that would form cycles in $AP_0$; see Figure 4 for reference.) Thus the number of edges in $E_0(AP_0)$ that can form paps of length 1 is at least


c|apo|(i<4p„i  >  fWiq-y.i  - 1)  _  ]APa] 


lOn 


> 

>  1 


lOn 

c(\AP0\-l) 


10 j 


- \AP0\ 


since  c/10  >  j  +  j  log  j  . 


Notice we did not need the idea of a deletable set in this part of the proof; that will come next. We have found an upper bound for $tv(1,n)$, the smallest tour value BTS can find (a.a.) using paps of length 1 only. We can now plug this bound and others which we have computed into the inequality from Theorem 5.1 to find an upper bound for $tv(2,n)$. This is the smallest tour value found by using paps of length 1 and 2 only. In order to keep $|B(G,\kappa)|$ small, we have made sure that BTS uses $\delta_{max} = 1$ while $[n - |T_{\kappa-1}|] \ge n/j$, where $n/j$ is as in the above lemma. Once we reach the minimum $n/j$, BTS will use $\delta_{max} = 2$.

To  prove  the  bound  for  fv(2,n),  we  want  to  find  valid  bounds  for  and  \B(G,  k)| 

such  that: 


e^l»l^2i+x!2p(|5(C,*)|+(c/I0)ne-^3)  =  0(1  fa) 


or  equivalently 

=  o(l/n) 




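It suffices to show that (a.a.)

$$-\frac{\gamma}{n^2}|\Psi_2| + \frac{\gamma}{cn}\left(|\overline{B}(G,\kappa)| + (c/10) n e^{-2c/3}\right) < -1$$

By algebra we see that we need,

$$|\Psi_2| \ge \frac{n}{c}\left(|\overline{B}(G,\kappa)| + \frac{c\,n e^{-2c/3}}{10} + \frac{cn}{\gamma}\right)$$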


Now,  we  let  \B(G,  *)|  =  2  +  fu(l,n).  Since  te  >  +  ^e-2c/3)  we  can  can  solve: 

l*,l>  =[2+,t+Mi^]=  22L") 

‘-™,—  r  n  /• 


Recall  that  (a.a.): 

I'M  >  mm£^|AP0|a£,  L--5^(|APo!  -  2)^] 

where  \APq\  >  £ne~2c^3.  We  know  that  given  any  Base  Set,  the  tour  value  it  implies  is 
<  |APo|  —  1.  If  we  can  solve 


mnl^-Wol’j,  V^(I‘4P°I  ”  2)5] 


rvu n) 

c 


for  |APol,  this  will  imply  a  tour  value  which  can  be  found  a.a. 

After  algebra: 

^(|Aft|  -  2)]  =  2^^^ 


Since  we  are  trying  to  minimize  |AP0|,  we  may  assume  that  |AP0|2  <  ^(|AP0|  —  2).  Thus 


If  we  know  that 


<£-2><Lr>^3T<1'"> 
then the above equation is feasible.
Minor  algebra  then  gives: 


<v<2.»)  +  1  <  |A/>„|  =  [  *  TV(l.n)n  = 

t  —  0  C 


Thus  we  are  (a.a.)  assured  of  BTS  finding  a  tour  of  at  most  the  above  value.  This  gives 
the  desired  result. 

To prove the rest of the bounds, assume they are true for all $\delta \le \delta_{max}$. To show this is true for $\delta_{max} + 1$ we need to find the minimum $|AP_0|$ that satisfies:

+  -(|B(G,«)!  +  (c/lOlne-*'3)  <  -1 
n* - * —  cn 


Recall  that  (a.a.): 

I*WmI  >  |i4P0|a^,(^-^)a(^)2^] 


Since  both  terms  are  less  than  (a.a.)  and 

TV(6max,n)  >  \B(Gy  /c)|  +  (c/10)ne_2c/3)  +  ^  we  may  solve  for: 


min[( 


*  ”  5  \S„ 


21  (i  -  5x2.  n  21 


'l^ol  2 


)(27,§1  = 


TV{6max,n)n 


As  before,  since  tv(S,  n)  <  IAP0I1  this  will  give  us  a  valid  bound  for  tv(S,  n). 
As  we  are  trying  to  minimize  \AP0\,  we  may  assume  that 


( 


MPol'j  <  ( 


/  —  5.2.  n  .2 1 
2  ’  K2 V  2 


As  before,  this  implies  that  TV(6max,n)  <  y(^)2,  or  else  no  feasible  solution  will  exist. 
Thus  tv(6max  +  l,n)  is  such  that: 

+  !.")<  MBol  -  1  <  =  S7(«m„  +  l,n) 

C  t.  ■"  0 

and  this  proves  the  theorem,  as  long  as  tv(6max  -f  l,n)  >  £ne-2c^3. 
For example, if $\ell = 11$ and $c \ge 597$:

$$tv(1,n) \le \frac{n}{15.4}, \quad tv(2,n) \le \frac{n}{20.81}, \quad tv(3,n) \le \frac{n}{35.64}, \quad tv(4,n) \le \frac{n}{61.33}, \quad tv(5,n) \le \frac{n}{105.83}$$

For all $\delta \ge 5$, we have:

$$tv(\delta + 1,n) \le .57862\ tv(\delta,n)$$

as long as $tv(\delta + 1,n) \ge 11 n e^{-2c/3}$, and $TV(\delta,n) \le 5.5n$. The rate of decrease is already very close to $1/\sqrt{3} \approx .57735$
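
The example values can be approximately reproduced from the recursion of Theorem 5.2. The sketch below is illustrative and rests on assumptions: it reads the recursion as $tv(\delta,n) = [(2\,TV(\delta-1,n)\,n/c)(2/(\ell-5))^{\delta-1}]^{1/2}$, takes $tv(1,n) = n/15.4$ from the example, and sets the small constant $\epsilon_c$ in $tv(0,n) = (2+\epsilon_c)n$ to zero, so the results agree with the quoted figures only to about three significant digits. The function name is hypothetical.

```python
import math

def tour_value_bounds(c, ell, j, depth, eps_c=0.0):
    """Bounds tv(d, n)/n for d = 0..depth under the assumed reading of Theorem 5.2.

    tv[0] = 2 + eps_c, tv[1] = 1/j, and for d >= 2
    tv[d] = sqrt((2 * TV(d-1) / c) * (2 / (ell - 5))**(d - 1)),
    where TV(d-1) = tv[0] + ... + tv[d-1] (all quantities divided by n).
    """
    r = 2.0 / (ell - 5)
    tv = [2.0 + eps_c, 1.0 / j]
    for d in range(2, depth + 1):
        tv_sum = sum(tv[:d])                    # TV(d-1, n) / n
        tv.append(math.sqrt(2.0 * tv_sum / c * r ** (d - 1)))
    return tv

tv = tour_value_bounds(c=597, ell=11, j=15.4, depth=6)
for d in range(1, 6):
    print(d, round(1.0 / tv[d], 2))             # denominators n / tv(d, n)
print(round(tv[6] / tv[5], 5))                  # ratio of successive bounds
```

Under these assumptions, the ratio of successive bounds equals $\left[(TV(\delta,n)/TV(\delta-1,n))\,(2/(\ell-5))\right]^{1/2}$, which approaches $1/\sqrt{3}$ for $\ell = 11$ as the increments $tv(\delta,n)$ become negligible relative to $TV(\delta-1,n)$.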


To prove the corollary, we need to investigate the rate of decrease of the $tv(\delta,n)$, as a function of $\delta$. As well, we need to ensure that $|B(G,\kappa)| \le nc/30$, and $TV(\delta,n) \le \frac{cn}{2}\left(\frac{\ell-5}{4\ell}\right)^2$.

Proof  of  Corollary: 

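We need only show that if:

$$\left[\frac{2\,TV(\delta,n)\,n}{c}\left(\frac{2}{\ell-5}\right)^{\delta-1}\right]^{1/2} = \rho_\delta\left[\frac{2\,TV(\delta - 1,n)\,n}{c}\left(\frac{2}{\ell-5}\right)^{\delta-2}\right]^{1/2}$$

then as $\delta \to \infty$, $\rho_\delta \to \left(\frac{2}{\ell-5}\right)^{1/2} + \epsilon$, where $\epsilon$ is $O(e^{-c/3})$.

The first expression can be simplified to

$$TV(\delta,n)\left(\frac{2}{\ell-5}\right) = \rho^2\,TV(\delta - 1,n)$$

Thus

$$\rho = \left(1 + \frac{tv(\delta,n)}{TV(\delta - 1,n)}\right)^{1/2}\left(\frac{2}{\ell-5}\right)^{1/2}$$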


Now  p  is  strictly  decreasing;  in  the  above  expression,  the  numerator  is  decreasing,  and 
the  denominator  increasing.  Note  that  from  the  previous  theorem: 

xl/2  < 

2nTV{6-\,nY  5}  1 
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as  long  as  fu(6,n)  >  int~2c^.  We  have  that  TV(S  -  l,n)  >  2 n,  so  as  well: 

<  fI/_2__\5-2|l/2 

TV(S- l,n)  “  ic(£-5J  J 

for  large  i  and  n  this  will  imply  that  tv(S,  n)  <  2 £ne~2cl3.  This  proves  the  corollary.  □ 
Note:  Though  we  must  always  satisfy  TV(6,n)  <  y(^)2  |fl(G, /c)|  <  nc/30, 
sometimes  this  is  always  true.  At  £  =  9,  we  can  see  that  we  must  have  TV(S,n)  <  2.845n, 
but  if  £  is  only  a  little  larger,  nice  things  happen.  For  example,  let  £  =  11  and  c  >  597  as 
we  did  before.  £  =  11  =>  we  must  have  TV (6,  n)  <  5.55n.  Notice  that: 


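$$\begin{aligned}
TV(\delta,n) &= TV(0,n) + \sum_{d=1}^{\delta} tv(d,n) \\
&\le TV(5,n) + \sum_{d=6}^{\delta} tv(d,n) \\
&\le 2.16786\,n + \frac{n}{105.83} \sum_{d=6}^{\delta} (.57862)^{d-5} \\
&\le 2.2\,n
\end{aligned}$$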


Thus  we  need  not  worry  that  TV(6,n)  >  y(^jr)2  or  \B(G,  /c)|  >  nc/30. 
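The arithmetic behind the $2.2n$ figure can be checked with a short sketch. It assumes only the numbers quoted in the text for $\ell = 11$ (the constants 2.16786, 105.83 and .57862) and sums the geometric tail; the function and variable names are hypothetical, chosen for illustration.

```python
import math

def tv_tail_bound(tv5_coeff, ratio):
    """Bound on sum_{d >= 6} tv(d, n) / n, assuming tv(5, n) <= tv5_coeff * n
    and tv(d + 1, n) <= ratio * tv(d, n) for all d >= 5 (geometric series)."""
    return tv5_coeff * ratio / (1.0 - ratio)

TV5_COEFF = 2.16786       # TV(5, n) / n, as quoted in the text
TV5_STEP = 1.0 / 105.83   # tv(5, n) / n, as quoted in the text
RATIO = 0.57862           # decrease factor for delta >= 5, as quoted in the text

# Upper bound on TV(delta, n) / n for every delta >= 5.
total = TV5_COEFF + tv_tail_bound(TV5_STEP, RATIO)

# Limiting rate (2 / (l - 5))^(1/2) evaluated at l = 11, i.e. 1 / sqrt(3).
limit_rate = math.sqrt(2.0 / (11 - 5))

if __name__ == "__main__":
    print(f"TV(delta, n) / n is at most {total:.5f}")
    print(f"limiting rate at l = 11: {limit_rate:.5f}")
```

The tail contributes only about one percent of $n$, which is why the constraint $TV(\delta,n) \le 5.5n$ is never binding in this example.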

The  first  theorem  gave  us  the  generalization  of  the  analysis  to  include  the  probability 
BTS  can  find  a  tour  of  a  certain  value,  given  Smax  (these  are  true  a.a.).  We  need  only  plug 
in  the  values  that  we  have  found  for  |f?(G,  «)|  and  I'PsI,  which  are  calculated  by  using  the 
tv(8,  n)  we  have  just  found. 

The corollary and final theorem tell us that as $\delta_{\max}$ is increased, BTS's (asymptotic)
guaranteed tour value will converge, at a rate which approaches $(2/(\ell-5))^{1/2}$, to a minimum
tour value of $\ell n e^{-2c/3} + 1$. However, we can find smaller guaranteed tour values by using
different bounds; for example, the other bound we found implies a minimum tour value
of $5 n e^{-2c/3} + 1$ (but not as fast an improvement rate). As well, this analysis rests on
a neighborhood argument (which we use to bound the number of necessary edges seen by
BTS) which sometimes uses only vertex sets with a large 0-degree. That is, we do parts of this
analysis using vertices that have 0-degree $\ge c/10$. By adding in contributions from 'small'
vertices we can obtain smaller guaranteed tour values for large $\delta$. The rate of improvement
that we find can still approach $(2/(\ell-5))^{1/2}$, and the best tour value we can find is still
$O(n e^{-2c/3})$.

Now consider these two things. First, the expected value of a randomly chosen tour is
$n - c$. Even small values of $\delta_{\max}$ let us find tours with a much smaller guaranteed
tour value. Second, by looking at the minimum number of vertices of 0-degree 0 or 1 (see
Lemma 5.2), we find a very quick (and loose) lower bound for the optimal tour value. This
bound shows that the optimal tour value is never less than

$$n(1-c/n)^{n-1} + \frac{(n-1)c}{2}(1-c/n)^{n-2} - n^{1/2}\log n.$$

A much tighter bound can no doubt be found, and we think it will be much closer
to $O(n e^{-2c/3})$.




Our lower bound for the optimal tour value will be closest to the upper bound for our
algorithm's performance (with $\delta_{\max}$ large) when the above bound is close to $\ell n e^{-2c/3}$. The
difference is smallest when c is large. For example, if $c = \log n$, the difference between the upper and
lower bounds is $O(n^{1/3})$, but if $c = O(1)$ then the difference is $O(n)$.


6  Modifications  to  BTS 

6.1  Analysis  of  2-opting 


In the previous section we mentioned that BTS could be reformulated as a strict
improvement algorithm. Here we present results which come from the modification of BTS
into a strict 2-opt heuristic. The key idea in the modification is to find a way to identify
which aps and paps translate into 2-changes. The full analysis can be found in [13].

Theorem  0.1  Given  a  random  graph  0n,c/n •  If  te  is  a  small  constant  such  that  tc  > 
(^  +  fQe-^)  (where  7  is  a  function  ofO(\ogn))  and  c  >  95  (for  c  must  satisfy  requirements 
of  Lemma  5.7  and  then  2-opting  can  find  (a.a.)  a  tour  of  length  tv,  such  that 

tv  <  fin) 


What are the time requirements of this 2-opt heuristic? We know that we can do a
2-change in time $O(\log n)$ since this is equivalent to a sequential exchange. As stated in the
last section, we know that the 0-degree of any vertex is $\le 4\log n$ a.a. At any iteration we
may have to search from $O(n)$ vertices, and we have $O(n)$ iterations. It follows that the
time requirements are $O(n^2(\log n)^2)$: $O(n)$ iterations, each searching from $O(n)$ vertices with
at most $4\log n$ incident 0-edges, at $O(\log n)$ per 2-change. Note that this is the same as the
time requirements for algorithm BTS, given that $\delta_{\max} = 1$.
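To make the objects in this discussion concrete, the following is a minimal sketch of a generic 2-opt on a Bernoulli Salesman instance. It is not the BTS-derived heuristic analyzed here (it does not use aps, paps, or the $O(\log n)$ sequential-exchange data structure); it only shows what a 2-change does on a 0/1 instance. The function names and the dense weight matrix are assumptions made for illustration.

```python
import random

def bernoulli_weights(n, c, rng):
    """Symmetric 0/1 weights: each edge independently has weight 0 with probability c/n."""
    w = [[1] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < c / n:
                w[i][j] = w[j][i] = 0
    return w

def tour_value(tour, w):
    """Sum of the edge weights around the closed tour."""
    n = len(tour)
    return sum(w[tour[k]][tour[(k + 1) % n]] for k in range(n))

def two_opt(tour, w):
    """Apply improving 2-changes until none remains (plain 2-opt, no speedups)."""
    n = len(tour)
    tour = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # the two tour edges share a vertex
                p, q = tour[i], tour[i + 1]
                r, s = tour[j], tour[(j + 1) % n]
                # Replace tour edges (p,q) and (r,s) by (p,r) and (q,s).
                if w[p][r] + w[q][s] < w[p][q] + w[r][s]:
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour

if __name__ == "__main__":
    rng = random.Random(0)
    n, c = 200, 10
    w = bernoulli_weights(n, c, rng)
    start = list(range(n))
    print(tour_value(start, w), tour_value(two_opt(start, w), w))
```

Because weights are 0 or 1, every accepted 2-change lowers the tour value by at least 1, so the loop terminates after at most n improvements.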

According to our analysis, the performance of our 2-opting heuristic never surpasses that
of BTS (when $\delta_{\max} = 1$). Notice that 2-opting is quite tour dependent, and will perform
better on tours that have 1-edges grouped together. This would imply that using 2-opting
after a nearest neighbor tour construction is probably a good idea. The nearest neighbor
algorithm will produce exactly the tours with which 2-opting is most successful. This suggests
that a modification of our 2-opt heuristic would probably be in order. We would want to do
successful 2-changes (those that lead to a lower valued tour) that keep the largest number of
grouped 1-edges. (However, this is beyond the scope of this paper. Perhaps we can tackle
this idea in the future.)

However,  given  our  comparison  of  time  requirements  of  BTS  and  2-opting,  it  would  seem 
unwise  to  use  2-opting  at  any  time.  Even  if  we  want  to  use  the  nearest  neighbor  algorithm 
to  find  an  initial  tour,  BTS  will  still  outperform  the  algorithm  restricted  to  be  a  strict  2-opt 
procedure  given  the  same  amount  of  time.  We  suspect  that  BTS  could  also  be  enhanced  by 
choosing paps which keep 1-edges grouped together.




6.2  Further  Implementations  of  Procedure  Search 

The procedure Search can be implemented on a directed or asymmetric 0,1 TSP. Tour
improvement algorithms like 2-opting are generally not well suited to the directed
TSP. Picture doing a 2-change on a (directed) tour. Up to 1/2 of the edges may change
direction, which can radically change the tour value. But Search can handle these problems by
keeping 0-edge components in their original direction. The same kind of asymptotic analysis
presented in this paper can be performed for the asymmetric Bernoulli Salesman. We need
only make sure we keep components in their original direction at all times. The most
restricted step will be in the choice of final edges on paps. The directed case may also be
analyzed in this manner if there are enough 0-edges present in the graph.

We also stated that Search may be implemented as a strict improvement algorithm. The
current theorem (for the symmetric salesman) still holds true for this implementation, though
the time requirement grows. Since searching up to depth $\delta$ implies we are using up to $2\delta + 1$
edge exchanges, this analysis is true for a $(2\delta + 1)$-opt.

In addition, we can modify Search to perform as a strict k-change algorithm. This
modification is very easy to do for 2-opting or 3-opting, although a little harder for a general
k-opt. We have already given results from the analysis of 2-opting. The analysis techniques
can be used for any strict k-opt; however, the increasing complexity makes it hard to obtain
good bounds.

7  Appendix 

For the following lemmas, let $G = G_{n,c/n}$, where $c^* \le c \le 1.1\log n$ and $c^*$ is a large
constant.

Proof  of  Lemma  5.2 

Lemma 5.2 Let $S_i = \{v : \text{vertex } v \text{ has 0-degree } i\}$; then G a.a. satisfies the following:

(5.2.1) $\bigl|\,|S_0| - n(1-c/n)^{n-1}\bigr| \le n^{1/2}\log n$

(5.2.2) $\bigl|\,|S_1| - (n-1)c(1-c/n)^{n-2}\bigr| \le n^{1/2}\log n$

(5.2.3) $\bigl|(|S_0| + |S_1|) - n(1-c/n)^{n-1} - (n-1)c(1-c/n)^{n-2}\bigr| \le n^{1/2}\log n$
Proof: 

We will prove only the first expression; the proofs of the others are similar.

To establish the first expression we use Chebyshev's Inequality, which tells us that:

$$P\bigl(\bigl|\,|S_0| - \mathrm{Exp}(|S_0|)\bigr| \ge n^{1/2}\log n\bigr) \le \frac{\mathrm{Var}(|S_0|)}{n(\log n)^2}$$

We have that:

$$\mathrm{Exp}(|S_0|) = n\binom{n-1}{0}\Bigl(\frac{c}{n}\Bigr)^{0}\Bigl(1-\frac{c}{n}\Bigr)^{n-1} = n\Bigl(1-\frac{c}{n}\Bigr)^{n-1}$$

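Now we look at $S_0$ in terms of indicator variables:

$$|S_0| = \sum_{j=1}^{n} I_0(v_j), \quad \text{where } I_0(v_j) = \begin{cases} 1 & \text{if the 0-degree of } v_j \text{ is } 0 \\ 0 & \text{otherwise} \end{cases}$$

This allows us to find an expression for the variance of $|S_0|$.

$$\begin{aligned}
\mathrm{Var}(|S_0|) &= \mathrm{Exp}\Bigl[\Bigl(\sum_{j=1}^{n} I_0(v_j)\Bigr)^2\Bigr] - \mathrm{Exp}\Bigl[\sum_{j=1}^{n} I_0(v_j)\Bigr]^2 \\
&= n\,\mathrm{Exp}[I_0(v_1)^2] + 2\binom{n}{2}\mathrm{Exp}[I_0(v_1)I_0(v_2)] - n^2\,\mathrm{Exp}[I_0(v_1)]^2 \\
&= n\,\mathrm{Exp}[I_0(v_1)] + n(n-1)\,\mathrm{Exp}[I_0(v_1)I_0(v_2)] - n^2\,\mathrm{Exp}[I_0(v_1)]^2
\end{aligned}$$

We can simplify this by noting that:

$$\begin{aligned}
\mathrm{Exp}[I_0(v_1)I_0(v_2)] &= P[I_0(v_1) = I_0(v_2) = 1] \\
&= P[I_0(v_1) = 1 \mid I_0(v_2) = 1]\; P[I_0(v_2) = 1] \\
&= \Bigl(1-\frac{c}{n}\Bigr)^{n-2}\Bigl(1-\frac{c}{n}\Bigr)^{n-1} \\
&= \Bigl(1-\frac{c}{n}\Bigr)^{2n-3}
\end{aligned}$$

Plugging in:

$$\begin{aligned}
\mathrm{Var}(|S_0|) &= n\Bigl(1-\frac{c}{n}\Bigr)^{n-1} + n(n-1)\Bigl(1-\frac{c}{n}\Bigr)^{2n-3} - n^2\Bigl(1-\frac{c}{n}\Bigr)^{2n-2} \\
&= n\Bigl(1-\frac{c}{n}\Bigr)^{n-1} + n^2\Bigl(1-\frac{c}{n}\Bigr)^{2n-2}\Bigl[\Bigl(1-\frac{c}{n}\Bigr)^{-1} - 1\Bigr] - n\Bigl(1-\frac{c}{n}\Bigr)^{2n-3} \\
&= n\Bigl(1-\frac{c}{n}\Bigr)^{n-1} + n^2\Bigl(1-\frac{c}{n}\Bigr)^{2n-2}\Bigl[\frac{c}{n-c}\Bigr] - n\Bigl(1-\frac{c}{n}\Bigr)^{2n-3} \\
&\le n\Bigl(1-\frac{c}{n}\Bigr)^{n-1} + 2nc\Bigl(1-\frac{c}{n}\Bigr)^{2n-2}
\end{aligned}$$

Using Chebyshev, the proof of (5.2.1) is complete.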

□ 
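The variance computation for (5.2.1) can be sanity-checked numerically. The sketch below evaluates the closed forms for the mean and variance of $|S_0|$ in $G_{n,c/n}$, the simplified upper bound $n(1-c/n)^{n-1} + 2nc(1-c/n)^{2n-2}$, and the resulting Chebyshev tail estimate at deviation $n^{1/2}\log n$. The function names are hypothetical, and the natural logarithm is an assumption about the intended base.

```python
import math

def s0_mean(n, c):
    """Expected number of vertices with 0-degree 0."""
    return n * (1 - c / n) ** (n - 1)

def s0_variance(n, c):
    """Exact variance of |S0| from the indicator-variable expansion."""
    q = 1 - c / n
    return n * q ** (n - 1) + n * (n - 1) * q ** (2 * n - 3) - n ** 2 * q ** (2 * n - 2)

def s0_variance_bound(n, c):
    """Simplified upper bound on the variance (valid when c <= n / 2)."""
    q = 1 - c / n
    return n * q ** (n - 1) + 2 * n * c * q ** (2 * n - 2)

def chebyshev_tail(n, c):
    """Chebyshev bound on P(| |S0| - Exp|S0| | >= n^(1/2) log n)."""
    return s0_variance_bound(n, c) / (n * math.log(n) ** 2)

if __name__ == "__main__":
    for n, c in [(10**3, 3.0), (10**5, 8.0), (10**6, 5.0)]:
        print(n, c, s0_mean(n, c), s0_variance(n, c), chebyshev_tail(n, c))
```

The tail estimate decays like $1/(\log n)^2$ times a constant depending on c, which is what makes the statement hold a.a.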

Proof of Lemma 5.3

Lemma 5.3 Given $G_{n,c/n} = (V, E)$, let $S \subseteq V$. Let $E_0(S) = \{e = (v,w) : v,w \in S$ and
$(v,w)$ is a 0-edge$\}$. Then for all $\emptyset \ne S \subseteq V$, G a.a. satisfies the following:

If $|S| = n/j$, where $j + j\log j \le c/10$, then $|E_0(S)| \ge \mathrm{Exp}[|E_0(S)|]/5$.


Proof: 

Recall the Markov Inequality, $P(|X| \ge 1) \le \mathrm{Exp}(|X|)/1$.

Let a = number of sets of size $|S|$ such that $|E_0(S)| < \mathrm{Exp}[|E_0(S)|]/5$. Then

$$\mathrm{Exp}(a) = \binom{n}{|S|} \sum_{i=0}^{f} \binom{\binom{|S|}{2}}{i} \Bigl(\frac{c}{n}\Bigr)^{i} \Bigl(1-\frac{c}{n}\Bigr)^{\binom{|S|}{2}-i}$$

where $f = \mathrm{Exp}[|E_0(S)|]/5 - 1$. For algebraic ease, let $f^* = (c|S|^2)/(10n)$ ($f < f^*$).
Since the maximum value of a Binomial distribution occurs at its expected value
(approximately), the maximum term in the above expression occurs at f. So we have:


Exp(a)  < 

(is,)(r 

+l)( 

('?) 

r 

)  (£)/•(!  _  £)('?)-> 
/  n  n 

re 

< 

^tis| 

W 

,\S\2c 
'  10n 

+  i) 

!(|S|  -51|)l0e]^  (i 

_  £)('?  )■ 
n 

c|S|a 

10rt 

< 

W 

1  lOn 

+  1) 

i(|S|,7Ji)10'i® 

cjl£|Q|Lzii. 

< 

V 

,151’c 

'  10n 

+  1) 

(Se)®  c=s[liKl|I=l2_ 

lSJlsi 

10n  « 

< 

W 

(\S\2c 
1  lOn 

+  1) 

2T|S|ac  -c|S|a  .  -ca|5|* 

g  n  g  In  T  io„2 

+ siSl 

T  2n 

< 

W 

-.13.151*  -<JLSl3tclSl 

t  n  e  10  n*  2n 

< 

e  " 

1 

the  final  expression  will  go  to  0  as  n  — »  oo  if  j  +  j  log  j  <  c/10. 

□ 


Proof of Lemma 5.7

Lemma 5.7 If $|B(G,\kappa)| \le nc/30$ and $c \ge 95$, then there a.a. exists a deletable set of size
$\frac{\gamma\log n}{2}$, where $\gamma = O(\log n)$, and c is greater than a large constant.

Proof:

By the definition of a deletable set, all edges in X must be incident to large vertices only
(those with 0-degree $\ge c/10$). However, $B(G,\kappa)$ may contain many edges incident to large
vertices as well. How many large vertices can $B(G,\kappa)$ completely cover (that is, for how many
vertices does $B(G,\kappa)$ contain all incident edges)?

1. n edges in $B(G,\kappa)$ are in the initial tour, so each vertex has at least 2 of its incident
edges in $B(G,\kappa)$. Thus $B(G,\kappa)$ can use up to the following number of large vertices:

$$\frac{2(|B(G,\kappa)| - n)}{\frac{c}{10} - 1} = \frac{20(|B(G,\kappa)| - n)}{c - 10}$$

2. So the number of vertices left that X has to choose from is at least:

$$n - n e^{-2c/3} - \frac{20(|B(G,\kappa)| - n)}{c - 10}$$

Assume $|B(G,\kappa)| \le nc/30$, where $c > 10$. Then the number of vertices left that X has to
choose from is at least:

$$n - n e^{-2c/3} - \frac{\frac{2}{3}n(c - 30)}{c - 10} \ge n\Bigl[\frac{1}{3} - e^{-2c/3}\Bigr] \ge \frac{n}{4}$$

3. By Lemma 5.3, if $10(4 + 4\log 4) \le c$ (which implies $c \ge 95$), then the number of 0-edges
present in the subgraph induced by the large vertices that X can choose from is at least:

$$\frac{(n/4)^2 c}{10n} = \frac{cn}{160}$$

4. We can find an acceptable X by finding a matching from this subset of edges and
vertices. Each vertex has $\le 4\log n$ 0-edges incident to it, so we can greedily find a matching
of the desired size. Any first edge chosen may exclude at most $(8\log n - 1)$ edges from the
matching. The second edge chosen may exclude $(8\log n - 1)$ more, etc. In this way, we can
easily find a matching of size at least:

$$\frac{cn/160}{8\log n} = \frac{cn}{1280\log n} \ge \frac{\gamma\log n}{2}$$

if $\gamma = O(\log n)$, and the Lemma is proved.

□ 
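The greedy step in part 4 can be sketched as follows. The sketch assumes a simple graph given as a list of distinct edges and uses hypothetical names; it scans the edges once and keeps any edge whose endpoints are both still free. If every vertex has degree at most $\Delta$, each kept edge rules out at most $2\Delta - 1$ edges (itself included), so the matching has at least $m/(2\Delta - 1)$ edges; with $\Delta = 4\log n$ this is the $(8\log n - 1)$ count used above.

```python
def greedy_matching(edges):
    """Scan edges in order; keep an edge if neither endpoint is already matched."""
    matched = set()
    matching = []
    for u, v in edges:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.add(u)
            matched.add(v)
    return matching

def max_degree(edges):
    """Largest number of edges meeting at a single vertex."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return max(degree.values(), default=0)

if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    print(greedy_matching(path))
```

The scan order affects which matching is found, but not the $m/(2\Delta - 1)$ guarantee.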

Proof  of  Lemma  5.8 
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Lemma  5.8  Given  that  Gn,c/n  satisfies  the  conditions  from  Lemmas  5.1  -  5.7,  and  that 
edges  in  X  were  chosen  independently  from  E0{G)  with  probability  (1  -  pp): 

P(S2\€i)  >  (1  -  0(1))(1  -  ll2l!i)(«/W)n.-*‘/3  +  |fl(C.<.)| 


Proof: 

We’ll  show: 

1)  P(X  forms  a  matching)  >  (1  —  o(l))  - 

2)  P(X  D  (Ea(SMALL)  U  B(G,  *))  =  0)  >  (1  -  2b«ft)(«/io)n.-a‘/*+I5TOI 
to  prove  the  Lemma. 

1)  Let  (3  =  \{v  :  v  has  >  2  incident  edges  in  X\X  fl  ( E0(SMALL )  U  B{G,  *))  =  0}| 


W>  1)  < 


Exp(fi) 


—  vertex  V{  has  >  2  incident  edges  in  X\X  n  ( E0{SMALL )  U  B(G ,  *))  =  0) 

t 

<  P(  vertex  Uj  has  >  2  incident  edges  in  X) 


=  „£(V‘)(2iSS«)»(i-2l!5S«r-< 

3  cn  n  cn  n 


=  »u  -  (i  -  ipr1  -("-D  <^F  a  -  ^Pr’)i 
=  »[i-(i-2l|»r*(H.(.-2)2ip)l 


7logn  Tlogn  2 


n*  '  '  '  v  '  nJ 


Now  we  show  that 


(1_i^r>>  2)1^: 


This  is  true  since 


_  ^lQgn\n-2 

r>2  * 


i-(n-2)^F+(r 
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r 


and  for  all  1  <  j  <  n  -  1: 


-2N/7logn 


J+l 


n* 


n‘ 


Thus 

PW  >  1)  <  n{l-[(l-(n-2)2^)(l  +  (n-2)i^]} 

- 

=  »[((»  -  2)^)’] 

<  (7  l°gn)2 


and  thus  1)  is  proved. 

2)  The  edges  in  E0  were  colored  independently,  with  the  probability  any  0-edge  given  by 
green  is  (1  —  Thus: 

P(X  D  (Eq(SMALL)  U  B{G,  «))0)  =  (1  -  li282)l&>(SM>iLL)ufl(c.*)| 

cn 

>  (j  7logn^c/io)ne-^/»+|B(G.K)| 

~~  cn 

since there are at most $(c/10)\,n e^{-2c/3}$ 0-edges incident to small vertices.

□

Abstract

A local minimum of a matrix is a cell whose value is smaller than those of its four
adjacent cells. For an n x n square matrix, we find a local minimum with at most
2.554n queries, and prove a lower bound of $\sqrt{2}\,n$ queries required by any method. For
a different neighborhood corresponding to the eight possible moves of a chess king, we
prove upper and lower bounds of $3n + O(\log n)$ and $2n$, respectively.

Figure 1: Local improvement can require ~ $n^2$ work

1  Introduction 

A local minimum of a matrix is a cell with value less than or equal to those of its neighboring
cells. How hard is it to find a local minimum of an n x n matrix? It takes zero computation to
determine that one exists, since any global minimum is surely one. And any local minimum,
once found, can be verified in $O(1)$ time. On the other hand, local improvement, the most
natural search method, can require time $\Omega(n^2)$, the same order as enumeration. For example,
suppose a local minimum is sought for a matrix with a descending spiral, illustrated for
n = 8 in Figure 1 (M is some large value such as $n^2$). Any local improvement algorithm
must traverse a path along that spiral out to the unique local optimum in the corner. But
the length of this path will be $\Theta(n^2)$ for most starting points in the square.

The large gap between the obvious $\Omega(1)$ lower bound and $O(n^2)$ upper bound is
characteristic of the local optimization problem. In this paper we narrow the gap between these
bounds considerably. There are two natural neighborhoods to consider: (i) in the King
adjacency, the 8 neighbors of a particular cell are those a king could move to from that cell on
a chessboard; (ii) in the Grid adjacency the 4 neighbors of a cell are the two adjacent in the
same row and the two adjacent in the same column. We will investigate both neighborhoods
here. Our best strategies for the two adjacencies turn out rather differently.

First we summarize our principal results: let r(n) equal the minimum number of matrix
lookups required by any valid algorithm that finds a local optimum of a square n x n matrix.
Then

$$2n \le r(n) \le 3n + O(\log n) \quad \text{(King Adjacency)}$$
$$\sqrt{2}\,n \le r(n) \le 2.554n \quad \text{(Grid Adjacency)}$$

Our  best  strategy  for  the  grid  adjacency,  which  yields  the  2.554n  upper  bound,  is  fairly 
complicated.  It  is  interesting  that  the  matrix,  perhaps  the  simplest  natural  discrete  structure 
for  local  optimization,  is  not  very  straightforward  to  solve. 

In the rest of this section we review necessary background on search procedures for local
optima, and apply it to our specific case of a matrix. The next sections develop the results
stated for the King and Grid adjacencies, respectively. We conclude in Section 4 with some
remarks and conjectures.

1.1 Divide-And-Conquer

We seek a strict local optimum of an n x n matrix A of distinct numbers A(i,j). (All results
apply to the slightly more general problem of seeking a nonstrict local optimum when the
A(i,j) values are not necessarily distinct.) Equivalently, form a graph G = (V,E) of the
matrix as follows: take the cells of A as the nodes V of G, and take the edge set

$$E = \{\{(i,j), (i',j')\} : \max(|i - i'|, |j - j'|) = 1\}$$

for the king adjacency; and

$$E = \{\{(i,j), (i',j')\} : |i - i'| + |j - j'| = 1\}$$

for the grid adjacency. Then we seek a local optimum of the function A on the graph G. As
in [4],[2],[5], our computational model employs an oracle to compute the values of A. A call
to the oracle is a query; the total number of queries is taken as the computational effort.
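The two edge sets translate directly into neighbor functions. The sketch below uses 0-based indices and hypothetical function names; it lists the king and grid neighbors of a cell and tests whether a cell is a (nonstrict) local minimum under either adjacency.

```python
def king_neighbors(i, j, n):
    """Cells (a, b) of an n x n matrix with max(|a - i|, |b - j|) = 1."""
    return [(a, b)
            for a in range(max(0, i - 1), min(n, i + 2))
            for b in range(max(0, j - 1), min(n, j + 2))
            if (a, b) != (i, j)]

def grid_neighbors(i, j, n):
    """Cells (a, b) of an n x n matrix with |a - i| + |b - j| = 1."""
    return [(a, b) for a, b in king_neighbors(i, j, n)
            if abs(a - i) + abs(b - j) == 1]

def is_local_min(A, i, j, neighbors):
    """True if A[i][j] is no larger than any neighboring entry."""
    return all(A[i][j] <= A[a][b] for a, b in neighbors(i, j, len(A)))

if __name__ == "__main__":
    A = [[5, 4, 9],
         [6, 3, 8],
         [7, 2, 1]]
    print(is_local_min(A, 2, 2, king_neighbors), is_local_min(A, 1, 1, grid_neighbors))
```

Every grid neighbor is also a king neighbor, so a king local minimum is always a grid local minimum, but not conversely.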

We  now  summarize  necessary  background  regarding  local  optima  on  graphs,  from  [4]. 
Results  are  in  terms  of  finding  a  local  minimum,  without  loss  of  generality. 

The following divide-and-conquer method will find a local minimum of A:

1. Query the vertices in a separating set S of G, finding a vertex $v \in S$ with minimum
A(v). (A separating set is a collection of vertices whose removal disconnects the graph.)

2. Query the vertices in N(v), i.e. those adjacent to v. If v is a local minimum, stop.
Otherwise proceed to Step 3.

3. Select $w \in N(v)$ with A(w) < A(v); replace G by the connected component of $G \setminus S$
containing w, return to Step 1.

Virtually all the work in the algorithm occurs in Step 1, where the vertices of the
separating sets S are queried. The best separators are found by solving the following

Separation  Game 

Input:  Graph  G  =  (V,  E). 

Two  players:  Minimizer  I,  Maximizer  II. 


Description: Player I removes vertices from G until it is disconnected. Player II selects
one of the newly created components to call G, discards the other components, and
passes the new G back to I. The game ends when $|V| \le 1$.

Step 1: $i = 0$, $V^0 = V$; score(G) = 0.

Step 2: If $|V^i| \le 1$ STOP.

Step 3: Player I chooses $S^i \subseteq V^i$ such that $G^i \setminus S^i$ is not connected or is the empty graph;
score(G) = score(G) + $|S^i|$.

Step 4: Player II selects $G^{i+1}$, a connected component of $G^i \setminus S^i$; $i := i + 1$; Go to Step 2.

The value of the separation game on G, denoted v(G), is score(G) when each player plays
optimally, I to minimize and II to maximize. Let K denote the number of separating sets
used by player I, and let $\Delta_{\max}(G)$ denote the maximum degree of any vertex in the graph G.
The principal result we employ is

Theorem 1.1.1 Any algorithm to find a local minimum of G requires at least v(G) queries;
the Divide-and-Conquer method requires at most $v(G) + K\Delta_{\max}(G)$ queries.

1.2  Implications 

The value of K in Theorem 1.1.1 is typically logarithmically small, and for our graph of the
matrix the maximum degree $\Delta_{\max}(G)$ = 8 or 4. Therefore, the implication of Theorem 1.1.1
is to transform our problem into an analysis of the separation game on G.

Solving the separation game is unfortunately NP-Complete in general, but we can employ
the following partial characterization ([4]):

Lemma  1.2.1  In  the  separation  game  S'  can  always  be  taken  to  be  a  minimal  separating 
set  of  G. 

The  minimal  separating  sets,  in  turn,  are  partially  characterized  in  the  following  Lemma 
by  an  interesting  dual  relationship  between  the  two  adjacencies: 

Lemma  1.2.2  A  minimal  separating  set  for  the  separation  game  under  the  king  adjacency 
must  be  connected  with  respect  to  the  grid  adjacency;  a  minimal  separating  set  under  the  grid 
adjacency  must  be  connected  with  respect  to  the  king  adjacency. 

Proof: Let S be a minimal separating set of G = (V, E). S separates some set $U \subset V$,
$U \cap S = \emptyset$, $U \ne \emptyset$, from $V - S - U$ in the graph G. The set U may be taken to be
connected, for if not we can replace it with any connected component. Here “connected”
means under the adjacency for which a local optimum is sought.

Now S must contain B(U), the boundary of U, i.e. $S \supseteq B(U) = \{v \in V : v \notin U, \exists u \in U, (u,v) \in E\}$,
for otherwise there would be a path from U to some vertex in $V - S - U$ that
did not pass through S. But also B(U) is a separating set, thus S = B(U) by the minimality
of S.

We also may take U to be topologically simple. If U is not simple, there are two cases:
(i) if G contains a vertex not encircled by U and not in B(U), then let V be U together with
all vertices encircled by U (thus including some members of B(U)). In this case, B(V) is
strictly contained in B(U) and therefore S = B(U) was not minimal. Otherwise, (ii) there
must exist a nonempty connected component U' of G, encircled by U. Then U' is simple by
induction, therefore $B(U') \subseteq B(U)$, and so we can replace U by U'.

Figure 2: Cell c and its king and grid boundaries

It  remains  to  show  that  when  U  is  connected  under  the  king  (respectively  grid)  adjacency, 
then  B(U)  is  connected  with  respect  to  the  grid  (respectively  king)  adjacency.  The  idea  is 
demonstrated  in  Figure  2. 

The  boundary  of  a  cell  under  the  king  adjacency  is  connected  with  respect  to  the  grid 
adjacency;  the  boundary  of  a  cell  under  the  grid  adjacency  is  connected  with  respect  to 
the  king  adjacency.  For  a  formal  proof,  we  employ  this  observation  in  an  induction  on 
\U\.  Remove  c,  the  rightmost  of  the  uppermost  cells  of  U .  Referring  to  Figure  2,  cell 
1,...,4.  By  induction,  the  boundary  of  U  —  c  is  connected  as  claimed.  Again 
referring to Figure 2, $\{5,6,7,8\} \cap U \ne \emptyset$ (king adjacency); $\{6,8\} \cap U \ne \emptyset$ (grid). When c
is added to $U - c$, the boundary gains all neighbors of c not in U, and loses c since U is
connected. Checking all the possible cases, it is generally easy to see that if $B(U - c)$ is
connected as claimed, so is B(U). The only non-trivial cases occur when the removal of
c disconnects the boundary. For example (king adjacency), if $6 \in U$, $7 \notin U$, $8 \in U$, then
$5 \in B(U)$, $7 \in B(U)$, and it is possible that 5 and 7 are only connected through c. But then
U is not simple, and we have a contradiction. □

Before  proceeding  with  the  analysis,  it  is  interesting  to  see  what  upper  and  lower  bounds 
can  be  derived  directly  from  known  results.  In  [4]  it  is  shown  (Corollary  4.11)  that  a  local 
optimum  for  any  planar  graph  may  be  found  in  13.35\/n  +  i)  queries.  Since  A  =  8  or 
4  here,  the  logarithmic  term  is  negligible,  and  we  can  take  13.35\/n  as  an  upper  bound  on 
the  necessary  number  of  queries,  for  both  the  king  and  grid  adjacencies.  For  a  lower  bound, 
we  appeal  to  the  following  results  (Corollary  4.5  in  [4]  and  Theorem  11  in  [3],  respectively): 

Theorem  1.2.1  For  any  graph  and  integer  t, 

v(G)  >  min{t,maxmin{|B(5)|  :  k  - 1  <  |S|  <  fc}}. 

Theorem 1.2.2 Let G = (V,E) be an n x n grid graph, and let $A \subset V$ satisfy $|V|/3 \le
|A| \le 2|V|/3$. Then $|B(A)| \ge n/3$.

Theorem 1.2.2 applies to the king adjacency as well, because all edges in the grid graph
are edges in the king graph. Letting $t = n/3$ and considering $k = |V|/2 = n^2/2$, we find
that $n/3$ is a lower bound on the number of queries needed to find either a king or grid local
optimum. Thus we have

Theorem 1.2.3 Let r(n) denote the least number of queries required by a valid algorithm
to find a local optimum in a matrix (king or grid adjacency). Then

$$n/3 \le r(n) \le 13.35n.$$

We  sharpen  these  bounds  considerably  in  the  following. 


2  The  King  Adjacency 

We  consider  the  problem  of  finding  a  local  optimum  of  a  matrix  where  the  neighborhood 
structure  is  defined  by  the  king  adjacency.  By  Lemma  1.2.2,  a  minimal  separating  set  here 
must  be  connected  with  respect  to  the  grid  adjacency.  We  call  any  such  set  a  region  of  the 
matrix.  Whenever  we  consider  a  region  of  a  matrix,  we  will  think  of  it  as  being  embedded 
within  a  sufficiently  large  square  matrix.  For  any  element  in  this  embedding  structure  that 
is  not  in  the  original  region,  define  its  distance  to  the  region  as  the  length  of  the  shortest 
path (using grid adjacency) to the region. Then give each of these embedding entries value
equal to its distance + $n^2$. The easiest way to think of this is to think of dropping the region
into  the  embedding  structure  and  hence  the  values  of  the  surrounding  region  will  be  strictly 
larger  than  the  values  within  the  region  and  will  gradually  climb  as  one  moves  further  away 
from  the  region.  This  will  prove  useful  later  when  we  approximate  the  indices  of  a  matrix 
to  be  queried  and  may  by  chance  query  an  entry  of  a  region  which  does  not  exist. 

2.1  Upper  Bounds 

Our divide-and-conquer algorithms will have two major types of steps: query and check. For
ease of presentation we first define these steps and give the parameters for each. Then we
present each procedure, first in words, and then using these generic steps. We also give a
pictorial view of each procedure.

A query step takes as input a description of a region of A and gives as output the
minimum entry in that region. This step requires a number of queries equal to the size of
the input set. This input will be given in one of two ways:

column set (called a Column Query): a pair made up of a column index, j, and a pair
of row indices, $(i_0, i_1)$ with $i_0 \le i_1$. Here the query step should be performed over rows
$i_0, i_0 + 1, \ldots, i_1$ in column j. (We will use the notation Column Query $(j, (i_0, i_1) : a)$ where
the output of the query is a.)

row set (called a Row Query): a pair made up of a row index, i, and a pair of column
indices, $(j_0, j_1)$ with $j_0 \le j_1$. Here the query step should be performed over columns
$j_0, j_0 + 1, \ldots, j_1$ in row i.

A  check  step  takes  as  input  an  entry  in  the  matrix  A  and  gives  as  output  the  smallest 
element  among  the  input  and  its  neighbors. 

Our  procedure  will  take  as  input  matrix  A  and  a  range  of  rows  and  a  range  of  columns, 
and  give  as  output  a  local  minimum  of  A  within  the  given  ranges. 




Figure 3: Illustration of Procedure Row-Column

Procedure Row-Column (Rows(1,n) Columns(1,n): $a^*$)

This algorithm is the divide-and-conquer algorithm defined in Section 1 with a specific,
natural choice of separators. The first separator is the central column of the matrix. If
the minimum of this column is not a local optimum then it is assumed without loss of
generality that the left neighbor is smaller, and the next separator is the central row of
the left submatrix of A. If more separators are needed, the procedure is repeated on the
remaining square submatrix (taken without loss of generality to be the upper left submatrix
of A).

Step 1: Column Query ($\lceil n/2 \rceil$, (1,n) : $a^1$)

Step 2: Check ($a^1$ : $\bar{a}^1$)

Step 3: If $\bar{a}^1 = a^1$ then STOP: $a^* = a^1$. Otherwise, without loss of generality $a^1 = a_{i,j}$
and $\bar{a}^1 = a_{i,j-1}$.

Step 4: Row Query ($\lceil n/2 \rceil$, (1, $\lceil n/2 \rceil - 1$) : $a^2$)

Step 5: Check ($a^2$ : $\bar{a}^2$)

Step 6: If $\bar{a}^2 = a^2$ then STOP: $a^* = a^2$. Otherwise, without loss of generality $a^2 = a_{i,j}$
and $\bar{a}^2 = a_{i-1,j}$.

Step 7: Procedure Row-Column (Rows(1, $\lceil n/2 \rceil - 1$) Columns(1, $\lceil n/2 \rceil - 1$): $a^*$)


Theorem 2.1.1 Procedure Row-Column finds a local minimum of an n x n matrix in less
than $3n + O(\log n)$ queries.

Proof: Let f(n) be the number of queries that this procedure requires for an n x n matrix.
Then clearly, $f(n) = n + \lceil n/2 \rceil + 12 + f(\lceil n/2 \rceil - 1)$. This leads to the solution
$f(n) = n + 2(\frac{n}{2} + \frac{n}{4} + \cdots) + O(\log n)$, which converges to $3n + O(\log n)$. □

Corollary 2.1.2 A local optimum of an m x n matrix, with $m \le n$, can be found in less
than $m(2 + \alpha) + n/2^{\alpha} + O(\log n) \le 2m + n + O(\log n)$ queries, where $\alpha = \lfloor \log_2 (n/m) \rfloor$.

Proof:  Slightly  altering  Procedure  Row-Column  to  always  bisect  the  longer  direction  (and 
hence  use  the  lesser  number  of  queries)  in  place  of  alternating  between  column  and  row 
queries  gives  this  result  immediately.  □ 
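The bisection idea behind Theorem 2.1.1 and Corollary 2.1.2 can be sketched in Python. This is an illustrative sketch, not the authors' exact Procedure Row-Column: the name `find_local_min` is hypothetical, the matrix is 0-indexed, and the side to continue in is chosen by tracking the smallest entry queried so far (a standard safeguard that we assume here) rather than by the "without loss of generality" neighbor of the procedure text. Each distinct entry inspected counts as one query.

```python
def find_local_min(A):
    """Return ((i, j), queries): a grid-adjacency local minimum of the
    rectangular matrix A and the number of distinct entries queried."""
    m, n = len(A), len(A[0])
    seen = {}                               # cache: each entry is counted once

    def q(cell):
        if cell not in seen:
            seen[cell] = A[cell[0]][cell[1]]
        return seen[cell]

    def neighbours(cell):
        i, j = cell
        cand = ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
        return [(r, c) for r, c in cand if 0 <= r < m and 0 <= c < n]

    r0, r1, c0, c1 = 0, m - 1, 0, n - 1     # current window, inclusive bounds
    best = None                             # smallest entry queried so far
    while True:
        # Query step: bisect the longer direction of the window.
        by_col = (c1 - c0) >= (r1 - r0)
        if by_col:
            mid = (c0 + c1) // 2
            sep = [(i, mid) for i in range(r0, r1 + 1)]
        else:
            mid = (r0 + r1) // 2
            sep = [(mid, j) for j in range(c0, c1 + 1)]
        c = min(sep, key=q)
        # Check step: compare the separator minimum with its neighbours.
        for cell in [c] + neighbours(c):
            if best is None or q(cell) < q(best):
                best = cell
        if q(c) <= q(best):
            return c, len(seen)             # c is no larger than any neighbour
        # Otherwise continue in the part of the window that contains `best`.
        if by_col:
            if best[1] < mid:
                c1 = mid - 1
            else:
                c0 = mid + 1
        else:
            if best[0] < mid:
                r1 = mid - 1
            else:
                r0 = mid + 1


if __name__ == "__main__":
    M = [[9, 8, 7, 6],
         [8, 7, 6, 5],
         [7, 6, 5, 4],
         [6, 5, 4, 3]]
    print(find_local_min(M))
```

On an n x n input the separators have sizes at most n, n/2, n/2, n/4, n/4, ..., so the query count stays below 3n plus a logarithmic number of check queries, in line with the theorem.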

2.2  Lower  Bounds 

We  now  find  lower  bounds  for  the  number  of  queries  needed  to  find  a  local  minimum  of  an 
n  x  n  matrix.  The  main  result  of  this  section  is  a  lower  bound  of  2n  queries. 

In  the  discussions  that  follow  it  will  be  helpful  to  have  the  notion  of  the  top,  bottom,  left 
and  right  of  a  region  of  a  matrix.  Suppose  that  R  is  a  region  within  n  x  n  matrix  A. 

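Define

\[
\begin{aligned}
a &= \min\{\, i \mid (i,j) \in R \text{ for any } j,\ 1 \le j \le n \,\} \\
b &= \min\{\, j \mid (a,j) \in R \,\} \\
c &= \max\{\, j \mid (a,j) \in R \,\} \\
d &= \max\{\, i \mid (i,j) \in R \text{ for any } j,\ 1 \le j \le n \,\} \\
e &= \min\{\, j \mid (d,j) \in R \,\} \\
f &= \max\{\, j \mid (d,j) \in R \,\}
\end{aligned}
\]

Then define the following “corners” of region R:

UL (“upper left”) = (a, b)

UR (“upper right”) = (a, c)

LL (“lower left”) = (d, e)

LR (“lower right”) = (d, f)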

Now,  in  order  to  define  the  sides,  consider  the  region  R  and  define  an  entry  of  R  to 
be  interior  if  it  has  four  grid  neighbors  within  R  and  frontier  otherwise.  We  will  think  of 
traveling  along  the  frontier  entries  from  one  corner  to  another  in  the  clockwise  direction 
(using the grid adjacency to define this path). The collection of frontier entries that one
encounters while traveling from UL to UR, inclusive, is called the top; those met while
traveling from UR to LR, inclusive, are called the right; those hit while in transit from LR to
LL, inclusive, are the bottom; and finally the others, the set lying on the path from LL to
UL, inclusive, are the left. It should be clear that if R is the whole matrix A then the top is
row  1,  the  right  is  column  n,  the  bottom  is  row  n,  and  the  left  is  column  1.  It  will  not  hurt 
our  arguments  to  have  an  entry  in  more  than  one  side. 
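As a small illustration of these definitions, the corners and the frontier of a region can be computed directly. The sketch below assumes a region is given as a Python set of 1-indexed (row, column) pairs; the function names `corners` and `frontier` are hypothetical and chosen only for this example.

```python
def corners(R):
    """Corners UL, UR, LL, LR of a region R, given as a set of (i, j) pairs."""
    a = min(i for i, _ in R)                  # topmost row that meets R
    d = max(i for i, _ in R)                  # bottommost row that meets R
    b = min(j for i, j in R if i == a)        # leftmost entry in row a
    c = max(j for i, j in R if i == a)        # rightmost entry in row a
    e = min(j for i, j in R if i == d)        # leftmost entry in row d
    f = max(j for i, j in R if i == d)        # rightmost entry in row d
    return {"UL": (a, b), "UR": (a, c), "LL": (d, e), "LR": (d, f)}


def frontier(R):
    """Entries of R that do not have all four grid neighbours inside R."""
    def interior(i, j):
        return all(p in R for p in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)))
    return {p for p in R if not interior(*p)}


if __name__ == "__main__":
    whole = {(i, j) for i in range(1, 5) for j in range(1, 5)}   # a 4 x 4 matrix
    print(corners(whole))
    print(sorted(frontier(whole)))
```

For the whole matrix the corners are the four corner entries and the frontier is the outer ring, which matches the remark that the top is row 1 and the left is column 1.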

Now, let the minimum diameter of R, MinD(R), be the length of a simple grid-connected
left-right path (i.e. a path of matrix entries from any element of the left to any element of
the right) of minimum length in R, where the length of a path is measured by the number
of entries in the path. Analogously define the maximum diameter, MaxD(R). Then, let
the minimum height, MinH(R), be the length of a simple grid-connected top-bottom path
of minimum length in R; and analogously define the maximum height, MaxH(R).

We  will  define  a  strategy  for  player  II,  the  maximizer  in  the  Separation  Game,  to  get  a 
lower  bound  on  v(G). 

Player  II  Strategy 

Before Player I plays, we define R to be the subset of matrix A consisting of all unqueried
entries. By definition of separation here, this forms a region. Therefore, after Player I’s turn,
the unqueried entries form two (disconnected) regions, say $R_1$ and $R_2$. If one of these $R_i$, $i = 1$
or 2, touches all four sides (top, bottom, left and right) of R, then choose that component,
$R_i$.

Otherwise, choose that component among $R_1$ and $R_2$ which has the largest maximum of
its maximum diameter and maximum height. That is, let $d_i = \max\{MaxD(R_i), MaxH(R_i)\}$,
and let $i^* = \arg\max_i \{d_i\}$. Then choose component $R_{i^*}$.

Lemma  2.2.1  It  requires  at  least  n  queries  to  find  a  local  minimum  of  an  n  x  n  matrix. 

Proof: Let the sequence of regions chosen by Player II be given by $A = R^0, R^1, \ldots, R^k$.
Then, using the strategy above, for some j, $1 \le j \le k$, region $R^j$ will not touch all four
sides of region $R^{j-1}$ (since at the end this is true). Hence, either the top and bottom of
$R^{j-1}$ are disconnected or the left and right of $R^{j-1}$ are disconnected by queried elements (or
both). Without loss of generality, assume that the left and right are disconnected. Then,
by piecing together the earlier queries, the left and right of the original matrix, A, are also
disconnected. Hence, there exists a path from the top to the bottom of A made up of queried
entries. Clearly, these entries alone have used n queries. □

Theorem 2.2.1 It requires at least $2n - 1$ queries to find a local minimum of an n x n
matrix.

Proof: First note that the theorem is true in the case of n = 1. Consider the above strategy
for Player II. Suppose, without loss of generality (as guaranteed by Lemma 2.2.1), that
eventually Player I queries a top-bottom path. Consider the first time such a path has been
queried (i.e., this is the first time Player II must choose a component that does not touch
all four sides). Suppose that up to this time $n + H$ queries have been made. If $H \ge n$, then
clearly the theorem is proved. So suppose that $H < n$. Now, in the chosen component, there
exists a square of side length equal to the minimum of the minimum diameter and minimum
height of the component. How can this value be made as small as possible? By using all
of the extra H queries to shorten one of them, say the minimum diameter. This is done by
using all of these queries in “horizontal” queries, and centering them so that the minimum
diameter is exactly $\frac{n-H}{2}$. Hence, in the chosen component, there exists a square matrix of
size at least $\frac{n-H}{2} \times \frac{n-H}{2}$. By Lemma 2.2.1 above, this requires at least $\frac{n-H}{2}$ queries. Thus, the
total number of queries so far is at least $n + H + \frac{n-H}{2} = 1.5n + .5H$. Now, “bootstrapping”
with this result in place of the bound given in the lemma gives that the enclosed square
requires at least $1.5\left(\frac{n-H}{2}\right)$ queries and hence the total lower bound is $1.75n + .25H$. Iterating this
procedure gives a lower bound of 2n queries. □
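The bootstrapping step in this proof can be traced numerically. The sketch below is only an illustration under one stated assumption: if a square of side s is known to need at least c·s queries, the total after the first top-bottom path is n + H + c(n - H)/2, and the worst case H = 0 supplies the constant for the next round. The function name `bootstrap` is hypothetical.

```python
from fractions import Fraction


def bootstrap(steps):
    """Coefficient pairs (a, b) of the lower bound a*n + b*H after each
    bootstrapping round, starting from Lemma 2.2.1 (c = 1)."""
    c = Fraction(1)                # a square of side s needs at least c*s queries
    pairs = []
    for _ in range(steps):
        a = 1 + c / 2              # n + H + c*(n - H)/2 = (1 + c/2)*n + (1 - c/2)*H
        b = 1 - c / 2
        pairs.append((a, b))
        c = a                      # worst case H = 0 gives the new constant
    return pairs


if __name__ == "__main__":
    for a, b in bootstrap(6):
        print(f"{float(a):.5f} n + {float(b):.5f} H")
```

The first two rounds reproduce the bounds 1.5n + .5H and 1.75n + .25H from the proof, and the coefficient of n approaches 2 as the rounds continue.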

Now  we  give  an  alternative  way  to  arrive  at  the  same  lower  bound  of  2n  queries.  We 
include  this  because  the  method  is  quite  different  and  we  believe  it  provides  some  additional 
insight  into  the  geometry  of  the  problem. 

First we need the following lemma. In this method, it is best to think of the matrix, A,
as being placed in the $\mathbb{R}^2$ plane in the following way. Each entry takes up a unit square, so
that the outer edge of the left of A is the y-axis, the outer edge of the bottom of A is the
x-axis, the outer edge of the right of A is the line $x = n$ and the outer edge of the top of A
is the line $y = n$. Then, given any region R, its area Area(R) is the actual (continuous) area
of the enclosed region; its perimeter Per(R) is the sum of the euclidean lengths of all the
straight lines that make up the outer edges of the frontier of the region.

Lemma 2.2.2 For any region, R,

\[ \frac{\sqrt{Area(R)}}{Per(R)} \le \frac{1}{4}. \]

Proof: First consider a rectangle with width $w$ and length $w + \delta$, where $\delta \ge 0$. Then
$Per(R) = 2w + 2(w + \delta)$ and hence $(Per(R))^2 = 16w^2 + 4\delta^2 + 16w\delta$. Further, $Area(R) =
w(w + \delta)$, so $16\,Area(R) = 16w^2 + 16w\delta$. Since $\delta \ge 0$, it is clear then that $(Per(R))^2 \ge
16\,Area(R)$ so the lemma follows for rectangles.

Now consider any region R. Let the tightest circumscribing rectangle be T. It is clear
that $Area(T) \ge Area(R)$. Hence it is sufficient to prove that $Per(T) \le Per(R)$ since then
we will have $(Per(R))^2 \ge (Per(T))^2 \ge 16\,Area(T) \ge 16\,Area(R)$. Think of traveling around
the  boundary  of  R  and  notice  that  this  boundary  will  agree  with  the  boundary  of  T  at  least 
for  some  stretch  on  each  side  of  T  (top,  bottom,  left  and  right).  Where  the  boundary  of  R 
differs  from  the  boundary  of  T,  call  it  a  journey.  Any  journey  either  originates  and  ends  on 
the  same  side  or  else  it  originates  on  one  side  and  ends  on  an  adjacent  side  of  T.  We  will 
consider  each  of  these  cases  separately. 

1. Suppose the journey originates and terminates on the same side. Without loss of
generality, suppose it is the top or bottom. Consider the origin as a point in the $\mathbb{R}^2$
plane, $(a, b)$. Then the terminus is another point $(c, b)$. Without loss of generality
assume that $c > a$. Then the amount of the perimeter of T between these two points
is exactly $c - a$. The amount of the perimeter of R that lies between these points is at
least $c - a$ since the boundary follows the grid adjacency (it might also have a vertical
component and hence could be greater).

2. Now without loss of generality, suppose that the journey originates on the left and ends
on the top. Then the origin is some point $(a, b)$ and the terminus is another point $(c, d)$.
We know that $d > b$. Then the perimeter of T between these points is $(d - b) + |c - a|$
and the perimeter of R must be at least this by the same reasoning as above (it must
travel at least $d - b$ vertical distance and at least $|c - a|$ horizontal distance).

Since the two regions clearly have the same perimeter between journeys, this completes the
proof.  □ 
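Lemma 2.2.2 can be checked numerically on small grid regions. The sketch below assumes a region is a set of unit cells (i, j) and measures the perimeter as the number of unit edges between a cell of the region and a cell outside it; for regions without holes this agrees with Per(R) as defined above, and for regions with holes it can only be larger. The function name `area_and_perimeter` is hypothetical.

```python
def area_and_perimeter(R):
    """Area and edge-count perimeter of a set R of unit cells (i, j)."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    area = len(R)
    # Every side of a cell that faces a cell outside R contributes length 1.
    perimeter = sum((i + di, j + dj) not in R for (i, j) in R for di, dj in steps)
    return area, perimeter


if __name__ == "__main__":
    square = {(i, j) for i in range(3) for j in range(3)}
    l_shape = {(0, 0), (1, 0), (1, 1)}
    for name, region in (("square", square), ("L-shape", l_shape)):
        area, per = area_and_perimeter(region)
        print(name, area, per, per * per >= 16 * area)
```

The square attains equality, $(Per)^2 = 16\,Area$, while the L-shaped region satisfies the inequality strictly.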

Now consider the region R before Player I queries during some iteration of the separation
game. The queries that follow before another turn of Player II can be combined into some
separating set, C. This causes R to be divided up into two regions, $R_1$ and $R_2$. For
ease of presentation, call Area(R) = A, Per(R) = P and similarly, $Area(R_i) = A_i$ and
$Per(R_i) = P_i$ for i = 1,2. We will abuse notation slightly and use C also to represent the
number of entries in the set C. We now need the following result.


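Lemma 2.2.3 Let

\[ \max_{i=1,2} \left\{ \frac{A_i}{P_i} \right\} = \frac{A_1}{P_1} \]

and suppose that

\[ A_1 = \lambda A \quad \text{for some } 0 < \lambda < 1. \]

Then,

\[ P_1 - \lambda P \le 2\lambda C. \]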


Proof: First we will need the following inequality: $P_1 + P_2 \le 2C + P$. To see this rigorously,
we need the following notation. Let the frontier of region $R_i$ for i = 1,2 be denoted $F_i$. Let
$F_i$ be the (not necessarily disjoint) union of $F_{i1}$ and $F_{i2}$, where $F_{i1}$ is the part of $F_i$ which
is adjacent to the frontier of C, and $F_{i2}$ is the part of $F_i$ which is a subset of the frontier of
R. We can break up $P_1 + P_2$ into the part which arises from tracing along $F_{12} \cup F_{22}$ and the
part which comes from tracing along $F_{11} \cup F_{21}$. The first part is clearly less than or equal to
P. Our goal is therefore to show that the second part is at most 2C. To this end, C can be
assumed to be a minimal separating set by Lemma 1.2.1, whence simple case analysis verifies
that no cell of C has more than 2 sides adjacent to $F_{11} \cup F_{21}$. Lemma 1.2.1 also ensures that
C has no interior. Therefore, the second part is at most 2C and the inequality holds.

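Thus,

\[ 2C + P \ge P_1 + P_2 \]

and so

\[ P_1 + P_2 - 2C \le P. \]

This implies that

\[
\begin{aligned}
P_1 - \lambda P &\le P_1 - \lambda (P_1 + P_2 - 2C) \\
                &= (1 - \lambda) P_1 - \lambda P_2 + 2\lambda C.
\end{aligned}
\]

So, if we can show that

\[ \lambda P_2 \ge (1 - \lambda) P_1 \]

then the lemma is proved.

By assumption,

\[
\begin{aligned}
\frac{A_1}{P_1} \ge \frac{A_2}{P_2}
 \;&\Longrightarrow\; \frac{\lambda A}{P_1} \ge \frac{(1 - \lambda) A}{P_2} \\
 \;&\Longrightarrow\; \lambda P_2 A \ge (1 - \lambda) A P_1 \\
 \;&\Longrightarrow\; \lambda P_2 \ge (1 - \lambda) P_1.
\end{aligned}
\]

□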

Now  we  are  ready  for  our  main  result. 

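Theorem 2.2.2 The number of queries needed to find a local optimum in a region of area
A and perimeter P is at least

\[ \frac{8A}{P}. \]

Proof: The proof is inductive on A. First note that if A = 1 then it must be that P = 4,
and clearly it requires exactly one query to solve the problem, so the result holds.

In general, it is sufficient to show that

\[ C + \frac{8A_1}{P_1} \ge \frac{8A}{P}, \]

where as in Lemma 2.2.3, $\max_{i=1,2} \left\{ \frac{A_i}{P_i} \right\} = \frac{A_1}{P_1}$. By Lemma 2.2.2 we know that

\[ P \ge 4\sqrt{A} \]

and that

\[ P_1 \ge 4\sqrt{A_1} = 4\sqrt{\lambda A}. \]

So, $P \times P_1 \ge 16A\sqrt{\lambda}$. This implies that $C\sqrt{\lambda}\,(P \times P_1) \ge 16AC\lambda$. Rearranging this gives

\[ \frac{C\sqrt{\lambda}}{8A} \ge \frac{2C\lambda}{P \times P_1}. \]

Using Lemma 2.2.3 gives the righthand side is

\[ \ge \frac{P_1 - \lambda P}{P \times P_1} = \frac{1}{P} - \frac{\lambda}{P_1}. \]

So,

\[ \frac{C\sqrt{\lambda}}{8} \ge \frac{A}{P} - \frac{A\lambda}{P_1} = \frac{A}{P} - \frac{A_1}{P_1}. \]

Hence

\[ \frac{A}{P} \le \frac{C\sqrt{\lambda}}{8} + \frac{A_1}{P_1}. \]

We know that $\lambda < 1$ so the result follows. □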

Corollary 2.2.3 The number of queries needed to find a local minimum of an n x n matrix
is at least 2n.

Proof: For an n x n square $A = n^2$ and $P = 4n$. Using this in Theorem 2.2.2 above gives
the result immediately. □
3  The  Grid  Adjacency

In  this  section  we  consider  the  problem  of  finding  a  local  optimum  of  a  matrix  where  the 
neighborhood  structure  is  defined  by  the  usual  adjacency  in  grids.  By  Lemma  1.2.2,  a 
minimal  separating  set  need  only  be  connected  with  respect  to  the  king  adjacency.  Thus  in 
this  section  a  region  of  a  matrix  will  be  a  subset  of  entries  so  connected.  As  in  Section  2 
we  will  continue  to  think  of  our  matrix  as  being  embedded  within  a  larger  square  matrix. 
Of  course,  we  must  update  our  definition  of  distance  to  be  consistent  with  king  adjacency 
connected  paths  here. 

3.1  Upper  Bounds 

Of course, Procedure Row-Column can still be used here since any set of entries that is
connected with respect to grid adjacency will also be connected with respect to king adjacency.
Hence we immediately get an upper bound of $3n + O(\log n)$ on the number of queries needed.
Here, though, we can show that it is not optimal. To improve on it, we employ diagonal
queries. The intuition here is that while the euclidean length of a diagonal of an n x n matrix
is $n\sqrt{2}$, there are only n entries in the matrix on the diagonal, so diagonal queries are more
efficient by a factor of $\sqrt{2}$.
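The claim that a diagonal of only n entries still separates the matrix under grid adjacency can be checked with a short breadth-first search. The sketch below is illustrative only; the function name `components_without` and the 0-indexed cell coordinates are assumptions made for this example.

```python
from collections import deque


def components_without(n, removed):
    """Number of grid-adjacency components of the n x n matrix after the
    entries in `removed` (the queried entries) are deleted."""
    left = {(i, j) for i in range(n) for j in range(n)} - set(removed)
    count = 0
    while left:
        count += 1
        queue = deque([left.pop()])
        while queue:
            i, j = queue.popleft()
            for p in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if p in left:
                    left.remove(p)
                    queue.append(p)
    return count


if __name__ == "__main__":
    n = 8
    diagonal = [(i, i) for i in range(n)]     # n entries, euclidean length n*sqrt(2)
    column = [(i, n // 2) for i in range(n)]  # n entries, euclidean length n
    print(len(diagonal), components_without(n, diagonal))
    print(len(column), components_without(n, column))
```

Both query sets use n entries and both split the unqueried entries into two components, but the diagonal covers a euclidean distance that is longer by a factor of $\sqrt{2}$.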

In  this  section  we  will  need  one  further  way  to  call  a  query  step: 

line segment (called a Line Query): a pair made up of a line definition, $bx + cy = d$, and
a range on x, $x_0 \le x \le x_1$. Here it should be interpreted that the matrix, A, is placed in
the $\mathbb{R}^2$ plane with the (n, 1) entry at the origin and the (1, n) entry at the (1, 1) position in
the plane. The query should be performed at the intersections of the matrix and the line
segment.

Notice  that  the  Row  and  Column  Queries  can  be  described  as  special  cases  of  the  Line 
Query.  However,  for  technical  reasons  we  leave  them  with  their  own  descriptions. 

To  make  the  following  procedures  easier  to  understand,  we  must  first  take  care  of  rotated 
matrices.  We  will  call  a  square  matrix  that  has  been  rotated  45  degrees  a  diamond  matrix. 
Let $A = [a_{ij}]$ be our n x n matrix and let $B = [b_{ij}]$ be the inscribed diamond matrix. Note
that B has euclidean side lengths equal to $n/\sqrt{2}$, but when counting queries the side length is
effectively only n/2. For this reason we will refer to B as an $\frac{n}{2} \times \frac{n}{2}$ diamond matrix inscribed
in the n x n matrix A. For ease, let n be even. Take $b_{1,1} = a_{n/2,1}$, $b_{1,n/2} = a_{1,n/2}$, $b_{n/2,1} = a_{n,n/2}$, and
$b_{n/2,n/2} = a_{n/2,n}$. Then to find a local minimum of B, we will perform Procedure Row-Column
on it, but querying the appropriate diagonal segments of A in place of rows and columns.
The input will be the matrix A, but it is understood that only the elements of B must be
known, and the check queries should only be performed over elements of B.

Procedure Diamond (Rows(1,n) Columns(1,n): $a^*$)

Step 1: Line Query ($x + y = 1$, $\frac{1}{4} \le x \le \frac{3}{4}$ : $a^1$)

Step 2: Check ($a^1 : \bar a^1$)

Step 3: If $\bar a^1 = a^1$ then STOP: $a^* = a^1$. Otherwise, without loss of generality $a^1 = a_{i,j}$
and $\bar a^1 = a_{i,j-1}$ or $a_{i+1,j}$.

Step 4: Line Query ($x = y$, $\frac{1}{4} \le x \le \frac{1}{2}$ : $a^2$)

Step 5: Check ($a^2 : \bar a^2$)

Step 6: If $\bar a^2 = a^2$ then STOP: $a^* = a^2$. Otherwise, without loss of generality $a^2 = a_{i,j}$
and $\bar a^2 = a_{i+1,j}$ or $a_{i,j+1}$.

Step 7: Procedure Diamond (Rows($\lceil n/2 \rceil + 1$, n) Columns($\lceil n/4 \rceil$, $\lceil 3n/4 \rceil$): $a^*$)

Corollary 3.1.1 Procedure Diamond finds a local minimum of an $\frac{n}{2} \times \frac{n}{2}$ diamond matrix in
less than $1.5n + O(\log n)$ queries.

Proof: This follows immediately from Theorem 2.1.1 and the discussion above on diamond
matrices. □

Corollary 3.1.2 A local minimum of an m x n diamond matrix, with $m \le n$, can be found in
less than $m(2 + \alpha + n/(2^{\alpha} m)) + O(\log n) \le 2m + n + O(\log n)$ queries, where $\alpha = \lfloor \log_2 (n/m) \rfloor$.

□ 

To make the presentation easier, when we call Procedure Diamond we will refer to this
more general form of Corollary 3.1.2 which can take as input an oblong diamond matrix.
We will call the procedure by giving as input the four corners of the matrix rather than the
rows and columns of the embedding square (or rectangular) matrix.

One last result we need before we can use this as a subroutine is the principle of
containment: If one region, B, is a subset of another region, A, then it can require no more
queries to find a local minimum of B than to find one of A. This is clear, since to find a
local optimum of B we could just consider B to be embedded within A as discussed above
(of course A in turn is embedded within another large matrix), and then find a local
optimum of A. We will sometimes call a procedure on a row and column set that implies that the
whole diamond matrix does not exist (it would require rows or columns with negative indices
or indices greater than n). When we do this we are actually relying on this containment
principle, and are considering the existing region to be embedded within the called diamond
matrix.

Procedure Diagonal (Rows(1,n) Columns(1,n): $a^*$)

This algorithm first queries and checks along the NE-SW diagonal of the square matrix.
Failing to find a local optimum here, it queries and checks along half of the NW-SE diagonal.
Then if it still hasn’t found a local optimum it queries a diagonal parallel to the NE-SW line
halfway down the triangle. After this it is either left with another triangle which it treats
with Procedure Diamond, or it forms a diamond and a triangle out of the resulting shape.
Each of these can be taken care of with Procedure Diamond.

Step 1: Line Query ($x = y$, $0 \le x \le 1$ : $a^1$)

Step 2: Check ($a^1 : \bar a^1$)

Step 3: If $\bar a^1 = a^1$ then STOP: $a^* = a^1$. Otherwise, without loss of generality $a^1 = a_{i,j}$
and $\bar a^1 = a_{i-1,j}$ or $a_{i,j-1}$.

Step 4: Line Query ($x + y = 1$, $0 \le x \le \frac{1}{2}$ : $a^2$)

Step 5: Check ($a^2 : \bar a^2$)

Step 6: If $\bar a^2 = a^2$ then STOP: $a^* = a^2$. Otherwise, without loss of generality $a^2 = a_{i,j}$
and $\bar a^2 = a_{i-1,j}$ or $a_{i,j+1}$.

Step 7: Line Query ($y = x + \frac{1}{2}$, $0 \le x \le \frac{1}{4}$ : $a^3$)

Step 8: Check ($a^3 : \bar a^3$)

Step 9: If $\bar a^3 = a^3$ then STOP: $a^* = a^3$. Otherwise,

Case 1: $a^3 = a_{i,j}$ and $\bar a^3 = a_{i,j-1}$ or $a_{i-1,j}$, then Procedure Diamond ((0, 1), (0, $\frac{1}{2}$),
($\frac{1}{4}$, $\frac{3}{4}$): $a^*$)

OR

Case 2: $a^3 = a_{i,j}$ and $\bar a^3 = a_{i,j+1}$ or $a_{i+1,j}$, then go to Step 10.

Step 10: Line Query ($y = \frac{1}{2} - x$, $0 \le x \le \frac{1}{4}$ : $a^4$)

Step 11: Check ($a^4 : \bar a^4$)

Step 12: If $\bar a^4 = a^4$ then STOP: $a^* = a^4$. Otherwise,

Case 1: $a^4 = a_{i,j}$ and $\bar a^4 = a_{i,j-1}$ or $a_{i+1,j}$, then Procedure Diamond ((0, $\frac{1}{2}$),
($\frac{1}{4}$, $\frac{1}{4}$), (0, 0): $a^*$)

OR

Case 2: $a^4 = a_{i,j}$ and $\bar a^4 = a_{i,j+1}$ or $a_{i-1,j}$, then Procedure Diamond ((0, $\frac{1}{2}$), ($\frac{1}{4}$, $\frac{3}{4}$),
($\frac{1}{2}$, $\frac{1}{2}$), ($\frac{1}{4}$, $\frac{1}{4}$): $a^*$)


Theorem 3.1.3 Procedure Diagonal finds a local optimum of an n x n matrix in less than
$2.75n + O(\log n)$ queries.

Proof: This procedure terminates in a Procedure Diamond iteration in either Case 1 of
Step 9 or Case 1 or 2 of Step 12. We will consider each of these in turn. The Procedure
Diamond iteration of each of these steps has as input a region within an $\frac{n}{4} \times \frac{n}{4}$ diamond
matrix (within an $\frac{n}{2} \times \frac{n}{2}$ square matrix) and hence by Corollary 3.1.1 requires no more than
$\frac{3n}{4} + O(\log n)$ queries. The number of queries up to Step 9 is $n + \frac{n}{2} + \frac{n}{4} + 12$. Hence if the
procedure terminates here then the total number of queries is less than $2.5n + O(\log n)$. If,
however, the procedure reaches Step 12 then it already has performed $n + \frac{n}{2} + \frac{n}{4} + \frac{n}{4} + 16$
queries and hence the total number is less than $2.75n + O(\log n)$ if the termination occurs
in either Case 1 or 2 of Step 12. □
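The arithmetic in this proof can be verified with exact fractions. The sketch below tracks only the coefficients of n (the constant check-step terms and the $O(\log n)$ terms are dropped), and it assumes the reading that the final Procedure Diamond call costs at most 3n/4, i.e. Corollary 3.1.1 applied at half scale. The variable names are hypothetical.

```python
from fractions import Fraction as F

# Coefficient of n for the final Procedure Diamond call (Corollary 3.1.1 at half scale).
diamond = F(3, 4)

# Line queries before Step 9: full diagonal, half diagonal, quarter diagonal.
up_to_step_9 = F(1) + F(1, 2) + F(1, 4)
# Reaching Step 12 adds one more quarter-length diagonal.
up_to_step_12 = up_to_step_9 + F(1, 4)

stop_at_9 = up_to_step_9 + diamond
stop_at_12 = up_to_step_12 + diamond

if __name__ == "__main__":
    print("terminate at Step 9: ", float(stop_at_9), "n")
    print("terminate at Step 12:", float(stop_at_12), "n")
```

The two totals differ by a quarter of n, which is the imbalance that the parameterized procedure later in this section is designed to remove.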

The  problem  with  this  algorithm  is  that  it  is  not  “balanced.”  That  is,  one  sees  that  if 
termination  occurs  in  Step  9  then  the  total  number  of  queries  is  significantly  less  than  if 
termination  occurs  in  Step  12.  Intuitively,  we  should  balance  the  regions  to  be  explored,  so 
that  the  number  of  queries  is  approximately  the  same  regardless  of  where  the  algorithm  is  led. 
The  place  where  we  have  the  leeway  to  do  this  in  Procedure  Diagonal  is  when  we  choose  the 
third  query  (Step  7).  It  was  rather  arbitrary  that  we  decided  to  query  the  halfway  diagonal. 
With  this  in  mind  we  now  introduce  a  “generic”  or  parameterized  algorithm  that  leaves  as 
parameters where the third (and later) diagonals should be queried. Then we optimize over
possible  values  of  these  parameters  in  order  to  balance  the  resulting  regions  and  so  minimize 
the  total  number  of  queries. 

We  will  need  one  subroutine  that  we  haven’t  seen  yet,  a  procedure  to  deal  with  triangles 
that  doesn’t  use  Procedure  Diamond  directly.  Instead,  this  algorithm  iterates  the  ideas 
within  Procedure  Diagonal  with  the  parametric  argument  described  above.  Its  input  will  be 
the  three  corners  of  the  triangle.  It  is  assumed  that  the  triangle  is  an  isosceles  right  triangle. 
Procedure Triangle ((0, 1), (0, 0), ($\frac{1}{2}$, $\frac{1}{2}$): $a^*$)

This algorithm first queries the diagonal that is $\alpha$ of the way down the triangle (Procedure
Diagonal uses $\alpha = \frac{1}{2}$). This diagonal divides the triangle into a triangle and a trapezoid. If
the diagonal does not turn up a local optimum then the algorithm will either iterate on the
new triangle portion or it needs to deal with the trapezoid. Recall that Procedure Diagonal
took care of the trapezoid by dividing it into a triangle and a diamond. Here, it first cuts it
into two with a NW-SE diagonal $\beta$ of the way from the first cut (Procedure Diagonal uses
$\beta = \frac{1}{2}$). Now, since $\beta$ doesn’t necessarily equal $\frac{1}{2}$, this diagonal cut divides the trapezoid
into a region that can be handled by Procedure Diamond and another trapezoid. This new
trapezoid is again divided by a NW-SE diagonal, this time $\gamma$ of the way down. Finally, this
results in two regions, one of which can be taken care of with Procedure Diamond and the
other by iterating Procedure Triangle.

Step  1:  Line  Query  (y  =  x  +  (I  —  a),0  <  x  <  f  :  a1) 

Step  2:  Check  (a1  :  a1) 

Step  3:  If  a1  =  a1  then  STOP:  a"  =  a1.  Otherwise, 

Case  1:  a1  =  a f>,  and  a1  =  a,j_1  or  then  Procedure  Triangle  ((1,0),  (0, 1  —  a), 
(a,l  -  a):  a*) 

OR 

Case  2:  a1  =  a< j  and  a1  =  or  Oj+tj,  then  go  to  Step  4. 

Step  4:  Line  Query  (y  =  —  x  +  (1  -  2/3),  -0  +  |  <  x  <  1  -  0  :  a2) 

Step  5:  Check  (a2  :  a2) 

Step  6:  If  a2  =  a2  then  STOP:  a*  =  a2.  Otherwise, 

Case  1:  a2  =  a,j  and  a3  =  a,J+1  or  a, -ij,  then  Procedure  Diamond  ((§-/?,  l-f-/3), 
(a,  1  -  a),  (5  -  0,  \  -  0),  (5,  5):  am) 

OR 

Case  2:  o2  =  otJ  and  o2  =  o,j_  1  or  a,+li>,  then  go  to  Step  7. 

Step  7:  Line  Query  (y  =  -x  +  (1  -  20  -  27),  min(0,  +  j  -  /?  -  7  :  a3) 

Step  8:  Check  (a3  :  a3) 
Figure  6:  Procedure  Triangle


Step  9:  If  a3  =  d3  then  STOP:  a*  =  a3.  Otherwise, 

Case  1:  a3  —  a,  j  and  a3  =  a,J+i  or  then  Procedure  Diamond  ((0, 1  —  a), 

OR 

Case  2:  a3  =  a^j  and  d3  =  or  aj+u,  then  Procedure  Triangle  ((0, 1  —  a),  (0,0), 

(§  -0-7»§  -0“7):  a“) 

Now we will use this procedure as a subroutine to get a parameterized form of Procedure
Diagonal. This proceeds exactly like Procedure Diagonal except that once we are left with
a triangle to analyze we use Procedure Triangle rather than the unparameterized version
(which, as mentioned above, sets $\alpha$ and $\beta$ equal to $\frac{1}{2}$ and has no $\gamma$).

Procedure  Para-Diagonal  (Rows(l,n)  Columns(l,n):  a*) 

As  mentioned  above,  this  procedure  starts  out  like  Procedure  Diagonal,  querying  the 
NE-SW  diagonal  and  then  the  half  NW-SE  diagonal.  Now  it  is  left  with  an  isosceles  right 
triangle  and  so  finishes  by  calling  Procedure  Triangle. 

Step 1: Line Query ($y = x$, $0 \le x \le 1$ : $a^1$)

Step 2: Check ($a^1 : \bar a^1$)

Step 3: If $\bar a^1 = a^1$ then STOP: $a^* = a^1$. Otherwise, without loss of generality $a^1 = a_{i,j}$
and $\bar a^1 = a_{i-1,j}$ or $a_{i,j-1}$.

Step 4: Line Query ($y = -x + 1$, $0 \le x \le \frac{1}{2}$ : $a^2$)

Step 5: Check ($a^2 : \bar a^2$)


Figure  7:  Procedure  Para-Diagonal 

Step 6: If $\bar a^2 = a^2$ then STOP: $a^* = a^2$. Otherwise, without loss of generality $a^2 = a_{i,j}$
and $\bar a^2 = a_{i-1,j}$ or $a_{i,j+1}$.

Step 7: Procedure Triangle ((1, 0), (0, 0), ($\frac{1}{2}$, $\frac{1}{2}$): $a^*$)


Theorem 3.1.4 Procedure Para-Diagonal finds a local optimum of an n x n matrix in less
than $2.5445n + O(\log n)$ queries.

Proof: For this procedure to converge we must restrict the values of the various parameters.
We will always require that $\alpha$ is between $\frac{1}{4}$ and $\frac{1}{2}$, $\beta$ is at least $\frac{1}{2} - \alpha$, and $\gamma$ is no more
than $\frac{1}{2} - \alpha$. To analyze this, we must separately consider the four different ways that this
algorithm can terminate. These are:

1.  If  Case  1  is  always  chosen  in  Step  3  of  Procedure  Triangle; 

2.  At  any  time  Case  1  is  chosen  in  Step  6  of  Procedure  Triangle; 

3.  At  any  time  Case  1  is  chosen  in  Step  9  of  Procedure  Triangle; 

4.  Case  2  is  chosen  in  Step  9  of  Procedure  Triangle. 

First,  it  is  clear  that  the  largest  number  of  queries  will  result  in  any  of  the  three  last  choices 
if  they  occur  at  the  first  possible  time,  i.e.,  the  first  time  that  step  of  Procedure  Triangle  is 
encountered.  Indeed,  the  number  of  queries  will  decrease  the  later  they  are  chosen.  Hence, 
we  will  analyze  the  possibilities  above,  with  options  (2)  -  (4)  understood  to  mean  that  they 
actually  occur  at  the  first  time  that  step  is  encountered  in  the  first  iteration  of  Procedure 
Triangle. Since we wish to optimize over choices of parameters $\alpha$, $\beta$, and $\gamma$, we will first
symbolically write down the number of queries in parametric form and then discuss possible
values of these parameters. Note that the first 6 steps of Procedure Para-Diagonal require
$1.5n + 8$ queries. Hence, we will only analyze Procedure Triangle (with the inputs used in
Step 7 of our procedure) and at the end will add these extra queries.

1. Case 1 of Step 3 is always chosen (Procedure Triangle is iterated): Here it is
straightforward that the total number of queries is less than $.5n\left[\frac{2\alpha}{1 - 2\alpha}\right] + O(\log n)$.

2. Case 1 of Step 6 is chosen: The Procedure Diamond step will require less than
$[2(\frac{1}{2} - \alpha) + \beta]n + O(\log n)$ queries by Corollary 3.1.2. The steps of Procedure Triangle leading
up to this step require $[\alpha + (\frac{1}{2} - \alpha)]n + 8$ queries. Hence this termination option requires
less than $1.5n + [-2\alpha + \beta]n + O(\log n)$ queries.

3. Case 1 of Step 9 is chosen: Here the Procedure Diamond step will require less than
$[2\gamma + (\frac{1}{2} - \alpha)]n + O(\log n)$ queries. The steps of Procedure Triangle leading up to
this termination option require $[\alpha + (\frac{1}{2} - \alpha) + (\frac{1}{2} - \beta - \gamma)]n + 12$ queries. Hence this
termination option requires less than $1.5n + [-\alpha - \beta + \gamma]n + O(\log n)$ queries.

4. Case 2 of Step 9 is chosen: Here the triangle left to analyze requires no more than
$\tau[\frac{1}{2} - \beta - \gamma]n$ queries where $\tau k n$ is the number of queries needed by Procedure Triangle
on an isosceles right triangle with side length $kn$. The steps of Procedure Triangle
leading up to this step are as given in (3) above. Hence this termination option
requires less than $n - \beta n - \gamma n + \tau[\frac{1}{2} - \beta - \gamma]n + O(\log n)$ queries.

As an example, consider the values of $\alpha = \frac{1}{3}$, $\beta = \frac{1}{4}$ and $\gamma = \frac{1}{6}$. The reader can easily
verify that these satisfy our convergence requirements stated above. These values give the
following results:

1. $1.0n + O(\log n)$

2. $1.08n + O(\log n)$

3. $1.08n + O(\log n)$

4. Using $\tau n = 1.25n$ from Procedure Diagonal (the number of queries less the first two
query and check steps) above, gives $.70n + O(\log n)$

Using these values, we would have an upper bound of $2.58n + O(\log n)$ queries for the
whole problem. Notice that we have almost accomplished complete balancing of the regions
here. It is probably too much to ask for to also balance the last triangular region. Next, we
discuss how to pick good values for $\alpha$, $\beta$ and $\gamma$.

In order to optimize the parameters $\alpha$, $\beta$, and $\gamma$, we will write them each in terms of
$.5\tau$. What we will show is that the smallest possible $.5\tau$ we can get using this procedure is
between 1.044 and 1.045.

Since we are trying to balance the different regions, what we will do is set each of the
results (1) - (3) above equal to $.5\tau n$. Then we will require that the quantity in (4) remains
no more than $.5\tau n$. We are suppressing the check steps and their $O(\log n)$ terms for clarity.
$$
\begin{array}{lll}
(1) & \cdots = .5\tau n & \Rightarrow\ \alpha = \dfrac{.5\tau}{1 + 2(.5\tau)} \\[2ex]
(2) & [1.5 - 2\alpha + \beta]n = .5\tau n & \Rightarrow\ \beta = \dfrac{2(.5\tau)^2 - 1.5}{2(.5\tau) + 1} \\[2ex]
(3) & [1.5 - \alpha - \beta + \gamma]n = .5\tau n & \Rightarrow\ \gamma = \dfrac{4(.5\tau)^2 - .5\tau - 3}{2(.5\tau) + 1} \\[2ex]
(4) & 1 - \beta - \gamma + \tau[.5 - \beta - \gamma] \le .5\tau & \text{and}\quad .5\tau \ge 0
\end{array}
$$
This gives a third order equation to solve. The solution is $.5\tau = 1.0445$. Note that we
must also make sure that our convergence ranges on the parameters are enforced ($.25 < \alpha <
.5$, $\beta > .5 - \alpha$, and $\gamma < .5 - \alpha$). Checking these with the above gives the information that
$1 < .5\tau < 1.075$. Hence we are within the necessary bounds. Using this value of $.5\tau = 1.0445$
gives


$\alpha \approx .33813532$
$\beta \approx .22077064$
$\gamma \approx .10340596$

As these satisfy all of our requirements, this is the solution. The total number of queries for
the $n \times n$ matrix is less than $[1.5 + 1.0445]n + O(\log n) = 2.5445n + O(\log n)$, completing
the proof of Theorem 3.1.4. □
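The balancing computation can be reproduced numerically. The sketch below writes $s = .5\tau$ and assumes $\alpha = s/(2s+1)$, with $\beta$ and $\gamma$ obtained from options (2) and (3) set equal to $s$; it then bisects on condition (4). The function names and the bisection bracket $[1, 1.075]$ are choices made for this illustration, not part of the original proof.

```python
def alpha_of(s):
    # Assumed closed form for alpha in terms of s = .5 * tau.
    return s / (2 * s + 1)

def beta_of(s):
    # From option (2): 1.5 - 2*alpha + beta = s.
    return s - 1.5 + 2 * alpha_of(s)

def gamma_of(s):
    # From option (3): 1.5 - alpha - beta + gamma = s.
    return s - 1.5 + alpha_of(s) + beta_of(s)

def slack(s):
    """Right side minus left side of condition (4); feasible when >= 0."""
    b, g = beta_of(s), gamma_of(s)
    return s - (1 - b - g + 2 * s * (0.5 - b - g))

def smallest_feasible(lo=1.0, hi=1.075, iters=60):
    """Bisection for the smallest s with slack(s) >= 0 (slack is increasing here)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if slack(mid) >= 0:
            hi = mid
        else:
            lo = mid
    return hi

root = smallest_feasible()
print(root, alpha_of(1.0445), beta_of(1.0445), gamma_of(1.0445))
```

The root lands between 1.044 and 1.045, and plugging in $s = 1.0445$ reproduces the three parameter values listed above to the printed precision.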

3.2  Lower  Bounds 

Here  notice  that  we  can  not  use  the  results  in  Section  2.2  directly  since  more  sophisticated 
query  sets  may  be  chosen  by  the  Player  I  with  this  adjacency  structure.  However,  from  our 
discussion  on  diagonals  in  Section  3.1,  Theorem  2.2.1  does  immediately  give  the  following 
result. 

Theorem 3.2.1 It requires at least $n\sqrt{2}$ queries to find a local minimum of an $n \times n$ matrix.
Figure 8: Queries to find a local optimum in an $n \times n$ square

4  Conclusions 

4.1  Conjectures 

The bounds found in Sections 2 and 3 are summarized in Figure 8 (logarithmic terms are
disregarded here). We have substantially improved on the bounds of Theorem 1.1.1, but
a gap persists for both adjacencies. In particular, we conjecture a lower bound of $3n$ for
the king adjacency (this would imply $3n/\sqrt{2}$ for the grid). Neither proof of Theorem 2.2.1
applies directly to this conjecture. The first proof, at the least, would require a new strategy
for Player II. (Against the given strategy, Player I can achieve close to $2n$ by separating
an $(n - 2) \times (n - 2)$ non-centered square from the rest of the $n \times n$ square.) To prove
the conjecture with the second method, one would show that at least $12A/P$ queries are
required to find a local optimum in a region of area $A$ and perimeter $P$. However, this is
false: consider a $1 \times \sqrt{2}$ rectangle. By Corollary 2.1.2, a local minimum can be found in
$2 + \sqrt{2}$ queries. Now $A = \sqrt{2}$, $P = 2\sqrt{2} + 2$; so $(2 + \sqrt{2})P/A = (2 + \sqrt{2})^2 = 11.656\ldots < 12$.
We do conjecture that at least $8\sqrt{2}A/P$ queries are required to find a local optimum under
the king adjacency. We also conjecture a lower bound of $2.5n$ for a square matrix under the
grid adjacency.
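The counterexample arithmetic is easy to confirm. The sketch below only evaluates the quantities named in the paragraph for the $1 \times \sqrt{2}$ rectangle; the variable names are illustrative.

```python
import math

area = math.sqrt(2)               # A for the 1 x sqrt(2) rectangle
perimeter = 2 * math.sqrt(2) + 2  # P
queries = 2 + math.sqrt(2)        # queries sufficient by Corollary 2.1.2

# If 12*A/P queries were always necessary, queries*P/A would be at least 12.
ratio = queries * perimeter / area

# The weaker conjectured bound 8*sqrt(2)*A/P for the same rectangle.
conjectured = 8 * math.sqrt(2) * area / perimeter

print(ratio, conjectured, queries)
```

The ratio equals $(2 + \sqrt{2})^2 \approx 11.657 < 12$, while the conjectured bound evaluates to about 3.31, which does not exceed the $2 + \sqrt{2} \approx 3.41$ queries used, so this rectangle does not contradict the weaker conjecture.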

Since completing an earlier draft of this paper, we have found that Althöfer and Koschnick
[1]  have  independently  studied  the  problem  of  local  optimization  on  an  m-dimensional  grid. 
For  m  =  2,  (our  grid  adjacency  case),  their  results  reduce  to 

4nn+1  =  ^  ~  -^  +  <>(*)  <  r(n)  <4 n  +  O(logn). 

It  would  be  interesting  to  see  if  Theorems  3.1.4  and  3.2.1  could  be  extended  to  m  =  3  or 
more  dimensions  to  strengthen  the  bounds  in  [1]. 

4.2  Related  problems 

Local saddlepoints are related to local optima with the grid adjacency. We say a point $(i,j)$
is a local saddlepoint iff

$A(i \pm 1, j) < A(i,j) < A(i, j \pm 1)$,

i.e., $A(i,j)$ is larger than its two horizontal grid neighbors and smaller than its two vertical
grid neighbors. Unlike local optima, local saddlepoints need not exist. Finding them turns
out to be more costly, as the following theorem shows.

Theorem  4.2.1  Any  valid  local  saddlepoint  algorithm  requires  at  least  nm/4  queries  in  the 
worst  case  for  an  n  x  m  matrix. 
Figure 9: Adversary's matrix

Figure 10: Comparison between grid and row-column adjacencies (rows: row-column, grid; columns: saddlepoint, local optimum)

Proof:  We  play  the  adversary  against  an  arbitrary  valid  algorithm.  We  let  A  be  made  of 
identical $2 \times 2$ submatrices as shown in Figure 9. Each blank cell will have value either 3 or 7,
but  we  do  not  decide  which  until  it  is  queried.  With  this  strategy,  none  of  the  fixed  cells  can 
be  a  local  saddlepoint,  and  each  unfixed  cell  is  a  local  saddlepoint  iff  its  value  is  3.  Now,  as 
the  algorithm  makes  queries,  we  respond  with  the  fixed  value  for  fixed  cells,  and  with  7  for 
the  unfixed  cells,  until  the  last  unfixed  cell  is  queried.  Then  we  randomly  decide  on  either 
3  or  7.  Obviously  the  algorithm  must  query  all  nm/4  unfixed  cells  to  determine  whether  or 
not  a  local  saddlepoint  exists.  □ 

Corollary 4.2.2 It requires $\Theta(n^2)$ queries to find a local saddlepoint.

Proof: Obviously there exists a valid $O(n^2)$ algorithm, and the result follows. □

In  a  broader  context,  Theorem  4.2.1  displays  a  nice  asymmetry  between  grid  and  row- 
column  adjacencies.  In  the  row-column  adjacency,  a  cell  is  adjacent  to  all  other  cells  in  its 
row  or  column.  That  is,  the  graph  has  edge  set 

$\{(i,j), (i',j')\} \in E \iff \min\{|i - i'|, |j - j'|\} = 0.$

A  (row-column)  saddlepoint  is  the  largest  in  its  row  and  smallest  in  its  column.  Bounds  for 
the  different  adjacencies  are  displayed  in  Figure  10.  For  the  grid  adjacency,  saddlepoints  are 
more  costly  to  find;  but  for  the  row-column  adjacency,  local  optima  are  more  costly.  It  would 
be  interesting  to  find  a  general  reason  for  the  opposing  behavior  of  the  two  neighborhoods. 
Abstract

In  this  paper  we  introduce  a  class  of  lattice  polyhedra,  called  2-Lattice 
polyhedra.  Examples  of  2-Lattice  polyhedra  include  bipartite  matching 
polyhedra,  the  intersection  of  two  integral  polymatroids,  the  connected 
polyhedron  of  an  undirected  graph,  and  the  perfectly  matchable  subgraph 
polytope  of  a  bipartite  graph.  We  show  that  the  maximum  cardinality  of 
a vector in a 2-Lattice polyhedron is equal to the minimum capacity of
a cover. Special cases of this result include König's Theorem, Menger's
Theorem, Dilworth's Theorem, and Edmonds' Theorem for cardinality
matroid  intersection  and  polymatroid  intersection.  We  show  that  the 
collection  of  minimum  covers  contains  an  upper  semi-lattice.  For  special 
classes  of  2-Lattice  polyhedra,  called  Matching  2-Lattice  polyhedra,  we 
provide  a  characterization  of  the  largest  member  in  the  family  of  nested 
covers  in  terms  of  maximum  cardinality  vectors  in  the  polyhedron. 
1 Introduction


Let $L$ be a finite set of elements (called lines) and let $\Gamma$ be a finite lattice with
partial order $(\Gamma, \preceq)$ which induces meet operation $\wedge$ and join operation $\vee$. Let
$\beta : \Gamma \to \mathbb{Z}$ be submodular and, for each element $\ell \in L$, let $\alpha_\ell : \Gamma \to \mathbb{Z}$ be
supermodular. Given $S \in \Gamma$ and $x \in \mathbb{R}_+^{|L|}$, let $\alpha(S)x = \sum(\alpha_\ell(S)x(\ell) : \ell \in L)$.
Then

$\{x \in \mathbb{R}_+^{|L|} : \alpha(S)x \le \beta(S) \text{ for each } S \in \Gamma\}$ (1.1)

is a lattice polyhedron. Lattice polyhedra were introduced by Hoffman and
Schwartz [15] and further studied by Johnson [16], Hoffman [13], Gröflin and
Hoffman [11], and Grishuhin [10]. We investigate a special class of lattice
polyhedra we call 2-Lattice polyhedra.

Here we consider those lattice polyhedra in which we allow $\Gamma$ to be infinite,
but require a finite bound on the length of chains in $\Gamma$. This ensures that $\Gamma$ is
a complete lattice and includes, for example, the lattice of linear subspaces of
a finite dimensional vector space. We further require that for each $\ell \in L$, $\alpha_\ell$ is
not only supermodular, but also non-decreasing and maps $\Gamma$ into $\{0, 1, 2\}$. The
set

$P(\alpha, \beta) = \{x \in \mathbb{R}_+^{|L|} : \alpha(S)x \le \beta(S) \text{ for each } S \in \Gamma\}$,

is called a 2-Lattice polyhedron and each vector $x \in P(\alpha, \beta)$ is called a 2-Lattice
vector. Examples of 2-Lattice polyhedra include bipartite matching polyhedra
[14, 18], the intersection of two integral polymatroids [8], the connected
polyhedron of an undirected graph [12], and the perfectly matchable subgraph polytope
of a bipartite graph [1].

A cover is a pair $(S, T)$ of (possibly identical) members of $\Gamma$ such that

$\alpha_\ell(S) + \alpha_\ell(T) \ge 2$ for each $\ell \in L$.

A cover may also be a single element $T$ of $\Gamma$ (we denote this kind of cover by
$(*, T)$) such that

$\alpha_\ell(T) \ge 2$ for each $\ell \in L$.

The capacity of a cover $(S, T)$, denoted $\beta(S, T)$, is

$\frac{1}{2}[\beta(S) + \beta(T)]$

while the capacity of a cover $(*, T)$, denoted $\beta(*, T)$, is

$\frac{1}{2}\beta(T)$.
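To make the definitions concrete, here is a minimal sketch for the simplest setting, assuming the lattice is the family of all subsets of a finite ground set, each line is a pair of ground elements, and $\alpha_\ell(S) = |S \cap \ell|$. The helper names and the toy choice $\beta(S) = |S|$ are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def alpha(line, S):
    """alpha_l(S) = |S ∩ l| (assumed incidence setting)."""
    return len(set(line) & set(S))

def is_cover(lines, T, S=None):
    """Check the cover condition; S=None models a cover (*, T), where alpha_l(*) = 0."""
    S = set() if S is None else S
    return all(alpha(l, S) + alpha(l, T) >= 2 for l in lines)

def capacity(beta, T, S=None):
    """Half the sum of beta over the members of the cover; beta(*) is taken as 0."""
    total = beta(T) + (beta(S) if S is not None else 0)
    return Fraction(total, 2)

# Toy instance: two lines, beta = cardinality (normalized and submodular).
lines = [("a", "a'"), ("b", "b'")]
beta = len
S, T = {"a", "b"}, {"a'", "b'"}
print(is_cover(lines, T, S), capacity(beta, T, S))
print(is_cover(lines, {"a", "a'", "b", "b'"}), capacity(beta, {"a", "a'", "b", "b'"}))
```

In this toy instance both covers have capacity 2, which matches the largest possible value of $x(\ell_1) + x(\ell_2)$ when each $x(\ell) \le 1$.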

In  this  paper  we  consider  the  relationship  between  the  problem  of  finding  a 
maximum  cardinality  2-Lattice  vector: 
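$$
\begin{array}{ll}
\max & \sum_{\ell \in L} x(\ell) \\
\text{s.t.} & \alpha(S)x \le \beta(S) \text{ for each } S \in \Gamma \qquad (1.2) \\
& x \ge 0
\end{array}
$$

and the dual problem:

$$
\begin{array}{ll}
\min & \sum_{S \in \Gamma} y(S)\beta(S) \\
\text{s.t.} & \sum_{S \in \Gamma} y(S)\alpha_\ell(S) \ge 1 \text{ for each } \ell \in L \qquad (1.3) \\
& y \ge 0
\end{array}
$$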

We  show  that  the  maximum  cardinality  of  a  2-Lattice  vector  is  the  minimum 
capacity of a cover. Special cases of this result include König's Theorem [17],
Menger’s  Theorem  [20],  Dilworth’s  Theorem  [6],  and  Edmonds’  Theorem  for 
cardinality  matroid  intersection  and  polymatroid  intersection  [8].  We  also  show 
that  the  set  of  minimum  covers  contains  an  upper  semi-lattice. 

This paper focuses on the relationships between the linear programs
(1.2) and (1.3), not on the integrality of extreme solutions to (1.2). We refer to
$\sum_{\ell \in L} x(\ell)$ as the "cardinality" of a vector $x$ even though $x$ may not be integral.
In fact, we only establish the half-integrality of extreme points of (1.2). In many
cases, such as bipartite matching and matroid intersection, the polyhedron is
known to have integral extreme points, whereas in others, most notably
non-bipartite matching, the extreme points are not integral, but the polyhedron does
have Chvátal rank 1 [5]. Our ultimate purpose is to characterize those 2-Lattice
polyhedra that share this rank 1 property.

Vande Vate [25] has already shown that 2-Lattice polyhedra have
half-integral extreme points and that their extreme points correspond to extreme
points of related non-bipartite matching problems. Unfortunately, this
correspondence by itself is not enough to ensure that 2-Lattice polyhedra have
Chvátal rank 1. Thus, we turn to the relationship between $\alpha$, $\beta$ and the convex
hull of integral 2-Lattice vectors.

All of the examples of 2-Lattice polyhedra relate $\alpha$ and $\beta$ in some way. We
capture these relationships with the following general conditions. First, let $\mathcal{E}$ be
a (possibly infinite) set, and let $L$ be a finite subset of $2^{\mathcal{E}}$ (generally chosen to be
a collection of pairs from $\mathcal{E}$). We also require that $\Gamma$ include the empty set and
be partially ordered by set containment. In this way, we may associate with each
set $S \subseteq \mathcal{E}$ the smallest member, $\sigma(S)$, of $\Gamma$ containing $S$. We further require $\beta$
to be normalized, non-decreasing and satisfy $\beta(\sigma\{e\}) = 1$ for each $e \in \mathcal{E}$, and
$\beta(\sigma\{\ell\}) = 2$ for each $\ell \in L$. Finally, we model the relationship between $\alpha$ and
$\beta$ via the condition $\alpha_\ell(S) = \beta(\sigma(\ell) \wedge S)$ for each $\ell \in L$ and $S \in \Gamma$. It is easy to
see that $\alpha_\ell$ is normalized and non-decreasing. It is also straightforward to prove
(see [25]) that $\alpha_\ell$ is supermodular. We call the resulting 2-Lattice polyhedra
Matching 2-Lattice polyhedra.

Figure 1.1: Example

When $\Gamma$ is the family of all subsets of a finite set $\mathcal{E}$ and $L$ is a partition of $\mathcal{E}$
into pairs we refer to the Matching 2-Lattice polyhedron $P(\alpha, \beta)$ as an incidence
2-Lattice polyhedron (note that in this setting, $\alpha_\ell : \Gamma \to \{0, 1, 2\}$ is defined
by $\alpha_\ell(S) = |S \cap \ell|$). Integral incidence 2-Lattice polyhedra include bipartite
matching polytopes [14, 18], network flow polyhedra [9], and the intersection of
two matroids [8]. Incidence 2-Lattice polyhedra have also been studied in the
context of non-bipartite matching [7].

Example 1 Consider the cycle matroid of the graph shown in Figure 1. The
partition $L$ is given by $L = \{\ell_1, \ell_2, \ell_3, \ell_4, \ell_5, \ell_6\}$, where $\ell_1 = \{(0,1), (1,5)\}$,
$\ell_2 = \{(0,2), (2,5)\}$, $\ell_3 = \{(0,3), (3,5)\}$, $\ell_4 = \{(0,4), (4,5)\}$, $\ell_5 = \{(6,7), (6,8)\}$,
$\ell_6 = \{(7,8), (8,9)\}$.

Then, $P(\alpha, \beta)$ is the set of $x \in \mathbb{R}_+^{|L|}$ satisfying

$x_i \le 1$ for $i = 1, 2, \ldots, 6$

$2x_i + 2x_j \le 3$ for $i, j \in \{1, \ldots, 4\}$, $i \ne j$

$2x_i + 2x_j + 2x_k \le 4$ for $i, j, k \in \{1, \ldots, 4\}$, $i \ne j \ne k$

$2x_1 + 2x_2 + 2x_3 + 2x_4 \le 5$

$2x_5 + x_6 \le 2$

Despite the significant successes to date, the formulation of a combinatorial
problem via an incidence 2-Lattice polyhedron is not always the best available
to us. We can, for instance, improve the incidence formulation in Example 1 via
the following matroid formulation. When $\beta$ is the rank function of a matroid
$M$ defined on $\mathcal{E}$, $L$ is a partition of $\mathcal{E}$ into pairs, and $\Gamma$ is the lattice of flats or
closed subsets in $M$, we refer to $P(\alpha, \beta)$ as a matroid 2-Lattice polyhedron.

Example 2 Consider once again the graph of Figure 1. The flats of the cycle
matroid of the first connected component consisting of lines $\ell_1$, $\ell_2$, $\ell_3$ and $\ell_4$ are:
the empty flat; single elements; pairs of elements; pairs of lines; sets of three
elements, one from each of three lines; sets of one element from each line; sets
of three lines; and $\{\ell_1, \ell_2, \ell_3, \ell_4\}$. The flats of the cycle matroid of the second
connected component consisting of lines $\ell_5$ and $\ell_6$ are: the empty flat; single
elements; sets consisting of $(8,9)$ with another element; $\{(6,7), (6,8), (7,8)\}$;
and $\{\ell_5, \ell_6\}$. The flats of this matroid are the combinations of two sets, one
from each connected component. Under the matroid formulation $P(\alpha, \beta)$ is the
set of $x \in \mathbb{R}_+^{|L|}$ satisfying

$x_i \le 1$ for $i = 1, 2, \ldots, 6$

$2x_i + 2x_j \le 3$ for $i, j \in \{1, \ldots, 4\}$, $i \ne j$

$2x_i + 2x_j + 2x_k \le 4$ for $i, j, k \in \{1, \ldots, 4\}$, $i \ne j \ne k$

$2x_1 + 2x_2 + 2x_3 + 2x_4 \le 5$

$x_5 + x_6 \le 1$

Note that this formulation has the same integral solutions as that of Example
1, but has cut off all extreme points with $x_5 = \frac{1}{2}$ and $x_6 = 1$. For example, it
has cut off the extreme points $(0, 0, 0, 0, \frac{1}{2}, 1)$ and $(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, 1)$.
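The claim about the two fractional points can be checked directly. The sketch below assumes the constraint systems read as: bounds $0 \le x_i \le 1$, the three families on $x_1, \ldots, x_4$ with right-hand sides 3, 4 and 5, and the final constraint $2x_5 + x_6 \le 2$ in the incidence formulation versus $x_5 + x_6 \le 1$ in the matroid formulation. The function names are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations

def first_component_ok(x):
    """Bounds on all six coordinates plus the constraints on x1..x4."""
    if any(v < 0 or v > 1 for v in x):
        return False
    # For k = 2, 3, 4 lines of the first component: 2 * (sum of k coords) <= k + 1.
    return all(2 * sum(c) <= k + 1
               for k in (2, 3, 4)
               for c in combinations(x[:4], k))

def in_example1(x):
    # Incidence formulation (assumed last constraint: 2*x5 + x6 <= 2).
    return first_component_ok(x) and 2 * x[4] + x[5] <= 2

def in_example2(x):
    # Matroid formulation (assumed last constraint: x5 + x6 <= 1).
    return first_component_ok(x) and x[4] + x[5] <= 1

half = F(1, 2)
p = (0, 0, 0, 0, half, 1)
q = (half,) * 5 + (1,)
for point in (p, q):
    print(in_example1(point), in_example2(point))
```

Both points satisfy the incidence system and violate $x_5 + x_6 \le 1$, while an integral point such as $(1, 0, 0, 0, 0, 1)$ satisfies both systems.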

When the matroid is linear and a representation is available, we can do still
better than the matroid formulation via the following linear formulation. Let
$A$ be a rational matrix and let $V$ denote the linear subspace spanned by the
columns of $A$. If $L$ is a collection of pairs of columns of $A$, $\Gamma$ is the lattice of
linear subspaces of $V$ and, for each $S \in \Gamma$, $\beta(S)$ denotes the linear rank of $S$
then we refer to $P(\alpha, \beta)$ as a linear 2-Lattice polyhedron.

Example 3 We once again consider the graph of Figure 1. Under the linear
Matching 2-Lattice formulation, $P(\alpha, \beta)$ is given by the set of $x \in \mathbb{R}_+^{|L|}$ satisfying

$x_1 + x_2 + x_3 + x_4 \le 1$

$x_5 + x_6 \le 1$

Notice that this is in fact the convex hull of integral solutions to the
polyhedron defined in Example 1.

Section 2 gives notation and preliminaries. The first main result of this
paper, the min-max theorem for 2-Lattice polyhedra, is proved in Section 3. This
section also shows that the family of minimum covers of a 2-Lattice polyhedron
contains an upper semi-lattice. In Section 4, we show the second main result of
this paper: a characterization of the largest member in the family of nested
minimum covers for Matching 2-Lattice polyhedra in terms of maximum cardinality
Matching 2-Lattice vectors.

2  Preliminaries 

In  this  section  we  define  notation  and  present  some  background  results. 
Definition 1 Given $x \in \mathbb{R}_+^{|L|}$ and a subset $S$ of $L$, define

$$
x_S(\ell) = \begin{cases} x(\ell) & \text{if } \ell \in S \\ 0 & \text{otherwise} \end{cases}
$$

Definition 2 For $S, T \in \Gamma$, the rank of $T$ contract $S$, denoted $\beta(T/S)$, is
defined to be $\beta(T \vee S) - \beta(S)$.

In  a  linear  matroid,  contraction  corresponds  to  orthogonal  projection. 

Definition  3  A  collection  of  sets  is  called  an  upper  semi-lattice  if  it  is  closed 
under  a  join  operation. 

Definition 4 The support of vector $w \in \mathbb{R}^n$, denoted by $\mathrm{supp}(w)$, is the set
$\{i \in [1, 2, \ldots, n] : w_i > 0\}$.

Given a 2-Lattice vector $x$, the collection of members in $\Gamma$ tight with respect
to $x$ is denoted by $T(x) = \{S \in \Gamma : \alpha(S)x = \beta(S)\}$. The following lemma shows
that $T(x)$ is a sublattice of $\Gamma$.

Lemma 2.1 Let $x$ be a 2-Lattice vector and suppose $S$ and $S'$ are in $T(x)$, then
$S \vee S'$ and $S \wedge S'$ are in $T(x)$.

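Proof.

$$
\begin{array}{rcl}
\beta(S \vee S') + \beta(S \wedge S') & \ge & \alpha(S \vee S')x + \alpha(S \wedge S')x \\
& \ge & \alpha(S)x + \alpha(S')x \\
& = & \beta(S) + \beta(S') \\
& \ge & \beta(S \vee S') + \beta(S \wedge S')
\end{array}
$$

Hence equality holds throughout, and since $\alpha(S \vee S')x \le \beta(S \vee S')$ and
$\alpha(S \wedge S')x \le \beta(S \wedge S')$, both $S \vee S'$ and $S \wedge S'$ are tight. □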

Since $T(x)$ is a sublattice of the complete lattice $\Gamma$, it has a largest member.

Definition 5 For each 2-Lattice vector $x$, the largest member of $T(x)$ is called
the closure of $x$ and is denoted by $\mathrm{cl}(x)$.

The following corollary is an immediate consequence of Lemma 2.1 and will
prove useful in arguing that certain vectors $x \in \mathbb{R}_+^{|L|}$ are 2-Lattice vectors.

Corollary 2.2 Let $x \in \mathbb{R}_+^{|L|}$ and suppose $Z$ and $Z'$ are members of $\Gamma$ such that

$\alpha(Z)x > \beta(Z)$,
$\alpha(Z')x = \beta(Z')$ and
$\alpha(Z \wedge Z')x \le \beta(Z \wedge Z')$,

then $\alpha(Z \vee Z')x > \beta(Z \vee Z')$.
Gröflin and Hoffman [11] demonstrated the following property of lattice
polyhedra. (Actually, Gröflin and Hoffman restricted the range of $\alpha_\ell$ to $\{-1, 0, 1\}$.
Nonetheless, their proof applies here as well.)

Theorem 2.3 (Gröflin and Hoffman) Each extreme point $x^*$ of a lattice
polyhedron

$\{x \in \mathbb{R}_+^{|L|} : \alpha(S)x \le \beta(S) \text{ for each } S \in \Gamma\}$

is the unique solution to a system of linear equations:

$\alpha(S_i)x = \beta(S_i)$ for $i = 1, \ldots, t$ and
$x(\ell) = 0$ for $\ell \in N \subseteq L$,

where $\mathcal{S} = \{S_i : i \in [1, \ldots, t]\}$ is a chain in $(\Gamma, \preceq)$, i.e., $S_1 \prec S_2 \prec \cdots \prec S_t$. □

Using the structure of the bases of the fractional matching polytope of a
graph, we are able to describe the structure of extreme 2-Lattice vectors. In
particular, Vande Vate (Theorem 2.5 proven in [25]) provides a mechanism for
describing extreme 2-Lattice vectors in terms of perfect fractional matchings of
graphs.

Given a graph $G = (V, E)$ and an integer vector $b \in \mathbb{R}^{|V|}$, the perfect
fractional b-matching polytope of $G$, denoted $FP(G, b)$, is:

$\{x \in \mathbb{R}_+^{|E|} : \sum(d_e(v)x(e) : e \in E) = b(v) \text{ for each } v \in V\}$.

Here, $d_e(v)$ is the degree of edge $e$ at vertex $v$. As the graph $G$ may have loops,
$d_e(v) \in \{0, 1, 2\}$ and as the graph $G$ may have spurs, $\sum(d_e(v) : v \in V) \in \{1, 2\}$.
Letting $D$ be the $|V| \times |E|$ matrix with elements $d_e(v)$, $FP(G, b)$ may be written
as:

$FP(G, b) = \{x \in \mathbb{R}_+^{|E|} : Dx = b\}$

Each vector $x \in FP(G, b)$ is a perfect fractional b-matching (or, more briefly, a
fractional matching) of $G$.

Chen [4] (also Balinski and Spielberg [2], Trotter [24], Nemhauser and Trotter
[21],  Pulleyblank  [22]  and  Bartholdi  and  Ratliff  [3])  described  the  bases  of  D  in 
terms  of  the  subgraphs  induced  by  the  corresponding  edges  of  G. 

A  subset  T  of  edges  is  a  bloom  if  the  subgraph  induced  by  the  edges  in  T 
is  connected,  contains  exactly  one  cycle  and  that  cycle  has  an  odd  number  of 
edges.  In  the  following  theorem,  each  edge  in  the  graph  G  is  identified  with  the 
corresponding  column  in  the  matrix  D.  If  G  has  spurs,  we  add  a  distinguished 
vertex  called  the  root  incident  to  each  spur  edge. 

Theorem  2.4  (Chen)  Suppose  D  is  the  incidence  matrix  of  a  connected  graph 
G.  A  subset  T  of  columns  is  a  base  of  D  if  and  only  if  T  is  a  maximal  set  of 
edges  such  that  each  component  of  the  subgraph  (V,  T)  is  either  a  tree  or  a  bloom. 
The  component  containing  the  root  must  be  a  tree.  □ 
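The bloom condition can be tested mechanically. The sketch below is one possible check, assuming edges are given as vertex pairs with a loop written as `(v, v)`; spurs and the root are not modelled, and the function name is hypothetical. It uses the fact that a connected graph with as many edges as vertices contains exactly one cycle, and that this cycle is odd exactly when the graph is not 2-colourable.

```python
def is_bloom(edges):
    """Return True if the edge set induces a connected graph with exactly one cycle, which is odd."""
    if not edges:
        return False
    vertices = {v for e in edges for v in e}
    # Connected + |E| == |V| is equivalent to containing exactly one cycle.
    if len(edges) != len(vertices):
        return False
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    start = next(iter(vertices))
    colour = {start: 0}
    stack = [start]
    odd_cycle = False
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in colour:
                colour[w] = 1 - colour[u]
                stack.append(w)
            elif colour[w] == colour[u]:
                odd_cycle = True  # an edge inside one colour class closes an odd cycle
    connected = len(colour) == len(vertices)
    return connected and odd_cycle

print(is_bloom([(1, 2), (2, 3), (3, 1), (3, 4)]))  # triangle with a pendant edge
print(is_bloom([(1, 2), (2, 3), (3, 4), (4, 1)]))  # even cycle
```

A single loop counts as a cycle of length one, so a loop with a tree hanging off it is also accepted by this check.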
In light of Theorem 2.4, a set $T$ of edges in a graph $G$ is called a base of $G$ if
the corresponding columns form a base of the incidence matrix $D$. Theorem 2.5
extends Theorem 2.4 to the 2-Lattice polyhedron via the following association
between extreme 2-Lattice vectors and perfect fractional b-matchings.

By Theorem 2.3, each extreme 2-Lattice vector $x^*$ is defined by a subset $N$ of
$L$ and a family $\mathcal{S} = \{S_i : i \in [1, \ldots, t]\}$ of members of $\Gamma$ with $S_1 \prec S_2 \prec \cdots \prec S_t$.
For ease of argument and presentation, we form a new complete lattice, $\Gamma^*$, by
appending a new smallest element, $*$, to $\Gamma$ and defining $\beta(*) = 0$ and $\alpha_\ell(*) = 0$
for each $\ell \in L$. (Note that as $*$ is the smallest element in $\Gamma^*$, $S \wedge * = *$ and $S \vee * = S$ for
each $S \in \Gamma$. Further, since $\alpha_\ell(*) = \beta(*) = 0$ it is clear that $\alpha_\ell$ is supermodular
and $\beta$ is submodular on $\Gamma^*$.)

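The pair $(\mathcal{S}, N)$ induces a graph, denoted $G(\mathcal{S}, L \setminus N)$, defined as follows.
For each $S_i \in \mathcal{S}$, there is a vertex $S_i$ in $G(\mathcal{S}, L \setminus N)$ and for each line $\ell \in L \setminus N$
there is an edge $\ell$ in $G(\mathcal{S}, L \setminus N)$. Let $S_0 = *$. The edge $\ell$ is incident to vertex
$S_i$ if $\alpha_\ell(S_i) - \alpha_\ell(S_{i-1}) = 1$ and is a loop at vertex $S_i$ if $\alpha_\ell(S_i) - \alpha_\ell(S_{i-1}) = 2$.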
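The construction can be sketched in a few lines. The input format here is an assumption made for illustration: for each line, the list of values $\alpha_\ell(S_1), \ldots, \alpha_\ell(S_t)$ along the chain, with $\alpha_\ell(S_0) = \alpha_\ell(*) = 0$ implied. The function name `basis_graph` is hypothetical.

```python
def basis_graph(alpha_chain, N=()):
    """Build the edges of the graph induced by a chain S_1 < ... < S_t and a line set N.

    alpha_chain maps each line to [alpha_l(S_1), ..., alpha_l(S_t)].
    Returns a dict mapping each line outside N to the tuple of vertex
    indices it touches: (i, j) for an ordinary edge, (i, i) for a loop,
    (i,) for a spur, and () if alpha_l stays 0 along the whole chain.
    """
    graph = {}
    for line, values in alpha_chain.items():
        if line in N:
            continue
        ends = []
        prev = 0  # alpha_l(*) = 0
        for i, a in enumerate(values, start=1):
            step = a - prev
            if step == 1:
                ends.append(i)        # edge is incident to S_i
            elif step == 2:
                ends.extend([i, i])   # edge is a loop at S_i
            elif step != 0:
                raise ValueError("alpha must be non-decreasing with values in {0, 1, 2}")
            prev = a
        graph[line] = tuple(ends)
    return graph

example = {"l1": [1, 2], "l2": [0, 2], "l3": [1, 1], "l4": [2, 2]}
print(basis_graph(example, N={"l4"}))
```

Because each $\alpha_\ell$ is non-decreasing with values in $\{0, 1, 2\}$, every line receives at most two endpoint slots, which is why loops and spurs are the only degenerate cases.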

Theorem 2.5 (Vande Vate) A 2-Lattice vector $x^*$ is extreme if and only if
there is a subset $N$ of $L$ and a family $\mathcal{S} = \{S_i : i \in [1, \ldots, t]\}$ of members of $\Gamma$
with $S_1 \prec S_2 \prec \cdots \prec S_t$ such that

1. $x^*(\ell) = 0$ for each $\ell \in N$,

2. $L \setminus N$ is a base of $G(\mathcal{S}, L \setminus N)$, and

3. The projection of $x^*$ onto the components indexed by lines in $L \setminus N$ is
the unique, perfect fractional b-matching in $G(\mathcal{S}, L \setminus N)$, where $b(S_i) =
\beta(S_i) - \beta(S_{i-1})$ for each $i \in \{1, \ldots, t\}$.

Corollary 2.6 Each extreme 2-Lattice vector is half-integral.

3  A  Min-Max  Formula 

Theorem 3.1 develops a min-max formula for the maximum cardinality of a
2-Lattice vector. This min-max formula generalizes König's Theorem [17],
Menger's Theorem [20], Dilworth's Theorem [6], and Edmonds' Theorem for
cardinality matroid intersection and polymatroid intersection [8].

Theorem 3.1 The maximum cardinality of a 2-Lattice vector is the minimum
capacity of a cover.

Proof. To see that the maximum cardinality of a 2-Lattice vector is at most
the minimum capacity of a cover, observe that for any cover $(S, T)$, the solution
$y(S) = y(T) = 1/2$ is dual feasible and has objective value $\beta(S, T)$.

To  prove  that  the  maximum  cardinality  of  a  2-Lattice  vector  equals  the 
minimum  capacity  of  a  cover,  we  show  that  there  is  an  optimum  solution  y*  to 
the  dual  problem  such  that: 
1. $\mathrm{supp}(y^*)$ forms a chain in $(\Gamma, \preceq)$,

2. $y^*$ is half-integral,

3. $y^*(S) > 0$ for at most two members $S \in \Gamma$.

First, to see that there is an optimum solution $y^*$ to the dual problem
satisfying (1) we employ an argument similar to that of Hoffman and Schwartz
[15], but modified to accommodate an infinite lattice $\Gamma$. Consider an optimal
dual solution $y$ with finite support (e.g. each extreme point optimal solution
has finite support). If $\mathrm{supp}(y)$ forms a chain in $(\Gamma, \preceq)$, we are done. Otherwise,
define a complete order $\trianglelefteq$ on $\mathrm{supp}(y)$ that is consistent with the partial order
$\preceq$. We argue that $y$ can be converted into a dual solution $y^*$ such that $\mathrm{supp}(y^*)$
forms a chain in $(\Gamma, \preceq)$ as follows.

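Let $S_0 = *$ and index the elements of $\mathrm{supp}(y)$ so that

$S_0 \trianglelefteq S_1 \trianglelefteq S_2 \trianglelefteq \cdots \trianglelefteq S_q$.

Define $i = i_y$ to be the smallest index such that $S_{i-1} \not\preceq S_i$ and $j = j_y$ to be the
smallest index such that $S_j \not\preceq S_i$. Consider the dual solution $\bar{y}$ such that

$$
\bar{y}(S) = \begin{cases}
y(S) - \epsilon & \text{if } S \in \{S_i, S_j\} \\
y(S) + \epsilon & \text{if } S \in \{S_i \vee S_j, S_i \wedge S_j\} \\
y(S) & \text{otherwise,}
\end{cases}
$$

where $\epsilon = \min\{y(S_i), y(S_j)\}$. Since

$\alpha_\ell(S_i \vee S_j) + \alpha_\ell(S_i \wedge S_j) \ge \alpha_\ell(S_i) + \alpha_\ell(S_j)$

for each line $\ell \in L$, $\bar{y}$ is dual feasible. Further, since

$\beta(S_i \vee S_j) + \beta(S_i \wedge S_j) \le \beta(S_i) + \beta(S_j)$,

$\sum_{S \in \Gamma} \bar{y}(S)\beta(S) \le \sum_{S \in \Gamma} y(S)\beta(S)$.

So, the (dual) objective value of $\bar{y}$ is no worse than that of $y$.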

Note that the chain $S_0 \preceq S_1 \preceq \cdots \preceq S_i \wedge S_j \preceq S_j \preceq \cdots \preceq S_{i-1}$ in $(\Gamma, \preceq)$ grows
with each successive revision of this kind. Since there is a finite upper bound
on the length of any chain in $\Gamma$, this process must ultimately terminate with a
dual solution $y^*$ such that $\mathrm{supp}(y^*)$ is a chain in $(\Gamma, \preceq)$.
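The uncrossing step can be illustrated in the simplest lattice, the subsets of a finite set ordered by inclusion, where meet and join are intersection and union. The sketch below is an assumption-laden simplification: it picks any incomparable pair in the support rather than the specific indices $i$ and $j$ used in the proof, and it uses exact rational weights so that the loop provably stops. The function name is hypothetical.

```python
from fractions import Fraction as F

def uncross(y):
    """Repeatedly replace an incomparable pair S, T in the support by S|T and S&T.

    y maps frozensets to positive rational weights. The returned weights
    have a support that is a chain under inclusion, and for every ground
    element the total weight of the sets containing it is unchanged.
    """
    y = {S: F(w) for S, w in y.items() if w > 0}
    while True:
        pair = next(((S, T) for S in y for T in y
                     if not (S <= T or T <= S)), None)
        if pair is None:
            return y
        S, T = pair
        eps = min(y[S], y[T])
        for U, delta in ((S, -eps), (T, -eps), (S | T, eps), (S & T, eps)):
            y[U] = y.get(U, 0) + delta
            if y[U] == 0:
                del y[U]

half = F(1, 2)
result = uncross({frozenset({1, 2}): half, frozenset({2, 3}): half})
print(result)
```

In this toy setting the step shifts weight from $\{1,2\}$ and $\{2,3\}$ onto $\{1,2,3\}$ and $\{2\}$, which form a chain, mirroring how the revision in the proof lengthens the chain in the support.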

Now, to see that $y^*$ satisfies (2), let $\mathcal{S} = \{S_i : i = 1, \ldots, t\}$ be a nested
family of members of $\Gamma$ and $N$ a subset of $L$ such that $y^*$ is the unique solution
to the system:

$\sum_{S_i \in \mathcal{S}} y(S_i)\alpha_\ell(S_i) = 1$ for each $\ell \in L \setminus N$ (3.4)

(3.5)


Let $y'$ be the unique solution to the system:

$\sum_{S_i \in \mathcal{S}} y(S_i)(\alpha_\ell(S_i) - \alpha_\ell(S_{i-1})) = 1$ for each $\ell \in L \setminus N$

Then $y'$ is the unique solution to the system $yA = 1$, where $A$ is the node-edge
incidence matrix of the basis graph $G(\mathcal{S}, L \setminus N)$, i.e.,

$$
y'(S_i) = \begin{cases}
1/2 & \text{if there is no path in } G(\mathcal{S}, L \setminus N) \text{ from the root to } S_i, \\
1 & \text{if there are an odd number of edges on the path in } G(\mathcal{S}, L \setminus N) \text{ from the root to } S_i, \text{ and} \\
0 & \text{if there are an even number of edges on the path in } G(\mathcal{S}, L \setminus N) \text{ from the root to } S_i.
\end{cases}
$$

And, we may compute $y^*$ as follows:

$y^*(S_i) = y'(S_i) - y'(S_{i+1})$ for $i = 1, \ldots, t-1$, and
$y^*(S_t) = y'(S_t)$.

It follows immediately that $y^*$ is half-integral.

Finally, to see that $y^*$ has at most two non-zero components observe that
since $y^*$ is dual feasible, it is non-negative. Thus, the corresponding vector $y'$
must be of the form

$$
y'(S_i) = \begin{cases}
1 & \text{for } i = 1, \ldots, t_1 \\
1/2 & \text{for } i = t_1 + 1, \ldots, t_2 \\
0 & \text{for } i = t_2 + 1, \ldots, t.
\end{cases}
$$

It follows that $y^*$ has at most two non-zero components and $\sum_{S_i \in \mathcal{S}} y^*(S_i) \in
\{1/2, 1\}$. If $y^*$ has exactly two non-zero components $S$ and $T$, then $y^*(S) =
y^*(T) = 1/2$ and $(S, T)$ is a minimum cover. If $y^*$ has only one non-zero
component $S$, then either $y^*(S) = 1$, in which case $(S, S)$ is a minimum cover, or
$y^*(S) = 1/2$ in which case $(*, S)$ is a minimum cover. □


Definition 6 A cover $(S, T)$ with $S \preceq T$ is called a nested cover.

The following lemma shows that we may associate a nested cover with each
minimum cover and hence that there is always a nested minimum cover.

Lemma 3.2 If $(S, T)$ is a minimum cover then $(S \wedge T, S \vee T)$ is a nested
minimum cover.

Proof. For each $\ell \in L$, $\alpha_\ell(S \wedge T) + \alpha_\ell(S \vee T) \ge \alpha_\ell(S) + \alpha_\ell(T) \ge 2$. Therefore,
$(S \wedge T, S \vee T)$ is a cover. Since $\beta(S \vee T) + \beta(S \wedge T) \le \beta(S) + \beta(T)$, it follows
that $(S \wedge T, S \vee T)$ is a minimum cover. □
Corollary 3.3 The maximum cardinality of a 2-Lattice vector is the minimum
capacity of a nested cover.

We present Edmonds' duality theorem for cardinality matroid intersection
as a special case of Theorem 3.1.

Corollary 3.4 Let $M_1$ be a matroid with rank function $r_1$ and let $M_2$ be a
matroid with rank function $r_2$ both defined on the same ground set $E$. Then the
maximum cardinality of an intersection in $M_1$ and $M_2$ is

$\min_{S \subseteq E} r_1(S) + r_2(E \setminus S)$.
Proof.  The  matroid  intersection  polyhedron 

P  »  {x  €  Rf  :  x(S)  <  ri(S)  and  x (S)  <  r2(S)  for  each  5  C  E} 
is  equivalent  to  the  2-Lattice  polyhedron 

{x  €  Ri  :  a(S)x  <  0(S)  for  each  SCf} 


where 

•  £  consists  of  two  copies  E  and  E'  of  E, 

•  L  consists  of  the  lines  {e,e'}  with  an  element  from  E  and  its  copy  in  E', 

•  For  each  t  €  L  and  5  £  £,  at(S)  =  |f  n  S|,  and 

•  For  each  5  €  £ ,  0(S)  =  ri(Sn£)  +  rj(S D  E'). 

Thus, Corollary 3.3 implies that the maximum cardinality of an intersection
in $M_1$ and $M_2$ is the minimum capacity of a nested cover. Let $(S, T)$ be a
minimum capacity nested cover. Define $S_1 = S \cap E$ and $S_2 = S \cap E'$. Similarly,
let $T_1 = T \cap E$ and $T_2 = T \cap E'$. Since $(S, T)$ is a nested cover, if $e \notin S_1$, then
$e' \in T_2$ and, if $e' \notin S_2$, then $e \in T_1$. Thus,

$$r_1(S_1) + r_2(E \setminus S_1) \le r_1(S_1) + r_2(T_2)$$

and

$$r_1(E \setminus S_2) + r_2(S_2) \le r_1(T_1) + r_2(S_2).$$

It is easy to establish that for each $x \in P$ and $S \subseteq E$

$$\sum_{e \in E} x(e) \le r_1(S) + r_2(E \setminus S).$$

Since each maximum cardinality $x \in P$ satisfies

$$\sum_{e \in E} x(e) = \beta(S, T) = \tfrac{1}{2}\left[r_1(S_1) + r_2(T_2) + r_1(T_1) + r_2(S_2)\right],$$

it follows that $\sum_{e \in E} x(e) = r_1(S_1) + r_2(E \setminus S_1) = r_1(E \setminus S_2) + r_2(S_2)$ and
hence that the maximum cardinality of an intersection in $M_1$ and $M_2$ is equal
to $\min_{S \subseteq E} r_1(S) + r_2(E \setminus S)$. □
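
The min-max formula of Corollary 3.4 can be checked by brute force on a tiny instance. The Python sketch below is illustrative only: the helper names (`subsets`, `partition_rank`, `max_common_independent`, `min_rank_cover`) and the example graph are assumptions introduced here, not part of the paper. It uses two partition matroids on the edge set of a small bipartite graph, so that common independent sets are exactly the matchings.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def partition_rank(blocks):
    """Rank function of a partition matroid allowing one element per block."""
    def rank(s):
        return sum(1 for block in blocks if block & s)
    return rank

def max_common_independent(ground, r1, r2):
    """Largest set that is independent in both matroids (rank equals size)."""
    return max(len(s) for s in subsets(ground)
               if r1(s) == len(s) and r2(s) == len(s))

def min_rank_cover(ground, r1, r2):
    """min over S of r1(S) + r2(E \\ S), the right-hand side of Corollary 3.4."""
    return min(r1(s) + r2(ground - s) for s in subsets(ground))

# Hypothetical example: edges of a bipartite graph with left nodes 0,1,2
# and right nodes 'a','b'.
ground = frozenset({(0, "a"), (0, "b"), (1, "a"), (2, "a")})
left_blocks = [frozenset(e for e in ground if e[0] == u) for u in (0, 1, 2)]
right_blocks = [frozenset(e for e in ground if e[1] == v) for v in ("a", "b")]
r1 = partition_rank(left_blocks)
r2 = partition_rank(right_blocks)

primal = max_common_independent(ground, r1, r2)
dual = min_rank_cover(ground, r1, r2)
```

On this instance both quantities equal 2 (the matching {0-b, 1-a} and the set S of edges at node 0). Exhaustive enumeration is exponential, so this is only a sanity check for very small ground sets.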

When the 2-Lattice polyhedron is known to have integral extreme points,
we may restrict attention to integer 2-Lattice vectors in Theorem 3.1. In the
case of matroid intersection, it is easy to verify that for each family $\mathcal{S} = \{S_i :
i \in \{1, \ldots, t\}\}$ of members of $\mathcal{F}$ with $S_1 \prec S_2 \prec \cdots \prec S_t$, $G(\mathcal{S}, L)$ is bipartite
and hence, as is well known, the extreme points of the matroid intersection
polyhedron are integral.

Note  that  the  notation  used  in  the  proof  of  Theorem  3.1  is  consistent  with 
our  construction  of  the  lattice  T*.  Therefore  we  henceforth  refer  to  all  covers  as 
(S,  T)  with  the  understanding  that  S  may  be  *. 

We can use our linear programming formulation to further characterize
minimum capacity covers.

Corollary 3.5 For each minimum cover $(S, T)$ and maximum 2-Lattice vector $x$,

• $a(S)x = \beta(S)$,

• $a(T)x = \beta(T)$, and

• if $a_\ell(S) + a_\ell(T) > 2$, then $x(\ell) = 0$.

Proof.  By  complementary  slackness.  □ 

Given a 2-Lattice polyhedron, let $\Omega$ be the collection of all maximum
cardinality 2-Lattice vectors and $\Omega_{ext}$ be the collection of all extreme maximum
cardinality 2-Lattice vectors.

Corollary 3.6 For each minimum cover $(S, T)$,

$$S, T \preceq \bigwedge(\mathrm{cl}(x) : x \in \Omega) \preceq \bigwedge(\mathrm{cl}(x) : x \in \Omega_{ext}).$$

Shapley and Shubik [23] showed that the collection of optimal dual solutions
to a bipartite matching problem forms a lattice. The same result holds for
cardinality matroid intersection. In particular, if $(S, E \setminus S)$ and $(S', E \setminus S')$ are
dual solutions in the sense of Corollary 3.4 to a matroid intersection problem,
then so are $(S \cap S', E \setminus (S \cap S'))$ and $(S \cup S', E \setminus (S \cup S'))$. We show that the set
of nested minimum covers for a 2-Lattice polytope forms an upper semi-lattice.
It remains an open question whether these covers in fact form a lattice.

Lemma 3.7 If $(S_1, T_1)$ and $(S_2, T_2)$ are nested minimum covers, then $(S_1 \wedge
S_2, T_1 \vee T_2)$ and $(S_1 \vee S_2, T_1 \wedge T_2)$ are minimum covers.

Proof. We first show that $(S_1 \wedge S_2, T_1 \vee T_2)$ and $(S_1 \vee S_2, T_1 \wedge T_2)$ are covers.
Since $(S_1, T_1)$ and $(S_2, T_2)$ are covers and $a_\ell$ is supermodular for each $\ell \in L$,

$$a_\ell(S_1 \wedge S_2) + a_\ell(S_1 \vee S_2) + a_\ell(T_1 \wedge T_2) + a_\ell(T_1 \vee T_2) \ge a_\ell(S_1) + a_\ell(S_2) + a_\ell(T_1) + a_\ell(T_2) \ge 4.$$

And so, we need only consider the cases in which $a_\ell(S_1 \vee S_2) + a_\ell(T_1 \wedge T_2)$ or
$a_\ell(S_1 \wedge S_2) + a_\ell(T_1 \vee T_2)$ is strictly greater than 2.

Case 1. If $a_\ell(S_1 \vee S_2) + a_\ell(T_1 \wedge T_2) > 2$, either $a_\ell(S_1 \vee S_2)$ or $a_\ell(T_1 \wedge T_2) = 2$.
However, since $(S_1, T_1)$ and $(S_2, T_2)$ are nested,

$$(S_1 \wedge S_2) \preceq (S_1 \vee S_2) \preceq (T_1 \vee T_2)$$

and

$$(S_1 \wedge S_2) \preceq (T_1 \wedge T_2) \preceq (T_1 \vee T_2).$$

So, $a_\ell(T_1 \vee T_2) = 2$; proving that $a_\ell(S_1 \wedge S_2) + a_\ell(T_1 \vee T_2) \ge 2$.

Case 2. If $a_\ell(S_1 \wedge S_2) + a_\ell(T_1 \vee T_2) > 2$, then $a_\ell(S_1 \wedge S_2) \ge 1$ and so,

$$1 \le a_\ell(S_1 \wedge S_2) \le a_\ell(S_1 \vee S_2).$$

Similarly,

$$1 \le a_\ell(S_1 \wedge S_2) \le a_\ell(T_1 \wedge T_2);$$

proving that $a_\ell(S_1 \vee S_2) + a_\ell(T_1 \wedge T_2) \ge 2$.

Thus, $a_\ell(S_1 \vee S_2) + a_\ell(T_1 \wedge T_2) \ge 2$ and $a_\ell(S_1 \wedge S_2) + a_\ell(T_1 \vee T_2) \ge 2$ for
each $\ell \in L$, i.e., $(S_1 \wedge S_2, T_1 \vee T_2)$ and $(S_1 \vee S_2, T_1 \wedge T_2)$ are covers.

Since $(S_1 \wedge S_2, T_1 \vee T_2)$ and $(S_1 \vee S_2, T_1 \wedge T_2)$ are covers and $(S_1, T_1)$ and
$(S_2, T_2)$ are minimum covers,

$$\beta(S_1 \wedge S_2) + \beta(T_1 \vee T_2) \ge \beta(S_1) + \beta(T_1)$$

and

$$\beta(S_1 \vee S_2) + \beta(T_1 \wedge T_2) \ge \beta(S_2) + \beta(T_2).$$

But, since $\beta$ is submodular,

$$\beta(S_1 \wedge S_2) + \beta(S_1 \vee S_2) + \beta(T_1 \wedge T_2) + \beta(T_1 \vee T_2) \le \beta(S_1) + \beta(S_2) + \beta(T_1) + \beta(T_2).$$

Thus, we must have equality throughout. □

Let $C$ be the collection of all nested minimum covers. We show that $C$ is an
upper semi-lattice with partial order defined by $(S, T) \preceq (S', T')$ if

• $T \preceq T'$ and

• $S' \preceq S$.

In fact, we show that the binary operation $\vee_C$ on $C$ defined by

$$(S, T) \vee_C (S', T') = (S \wedge S', T \vee T')$$

is the join operation in $C$.

Lemma 3.8 $C$ is an upper semi-lattice.

Proof. By Lemma 3.2 and Lemma 3.7, $(S \wedge S', T \vee T')$ is a nested minimum
cover. It is easy to verify that this is also the least upper bound of $(S, T)$ and
$(S', T')$. Thus, $C$ is an upper semi-lattice. □

The following example shows that $C$ need not be a lattice. Consider the
incidence 2-Lattice polyhedra on $\mathcal{E} = \{e, f\}$ with the single line $\ell = \{e, f\}$
and $\beta(S)$ defined by $|S|$. The nested minimum covers are $(\emptyset, \mathcal{E})$, $(\{e\}, \{e\})$ and
$(\{f\}, \{f\})$. Clearly $(\emptyset, \mathcal{E})$ is the least upper bound of $(\{e\}, \{e\})$ and $(\{f\}, \{f\})$,
but these two nested covers do not have a common lower bound in $C$.
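
The example above is small enough to enumerate. The Python sketch below assumes that the lattice is the full power set of {e, f}, that the line coefficient is the number of elements of the line in S, and that capacity is proportional to |S| + |T|; all function names are hypothetical and introduced only for this check.

```python
from itertools import combinations

GROUND = frozenset({"e", "f"})
LINE = frozenset({"e", "f"})   # the single line of the example

def subsets(ground):
    items = sorted(ground)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def is_cover(s, t):
    # Cover condition: a_l(S) + a_l(T) >= 2 with a_l(S) = |l ∩ S|.
    return len(LINE & s) + len(LINE & t) >= 2

def capacity(s, t):
    # beta(S) = |S|; any positive scaling leaves the minimisers unchanged.
    return len(s) + len(t)

covers = [(s, t) for s in subsets(GROUND) for t in subsets(GROUND)
          if is_cover(s, t)]
min_capacity = min(capacity(s, t) for s, t in covers)
nested_min = [(s, t) for s, t in covers
              if capacity(s, t) == min_capacity and s <= t]

def below(p, q):
    """(S, T) precedes (S', T') when T ⊆ T' and S' ⊆ S."""
    (s, t), (s2, t2) = p, q
    return t <= t2 and s2 <= s

def common_lower_bounds(p, q):
    return [r for r in nested_min if below(r, p) and below(r, q)]

def common_upper_bounds(p, q):
    return [r for r in nested_min if below(p, r) and below(q, r)]

pe = (frozenset({"e"}), frozenset({"e"}))
pf = (frozenset({"f"}), frozenset({"f"}))
```

Here `common_upper_bounds(pe, pf)` contains only the cover (∅, {e, f}), while `common_lower_bounds(pe, pf)` is empty, matching the claim that C is an upper semi-lattice but not a lattice in this example.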

The  following  corollary  shows  that  there  is  a  largest  cover  in  C  and  in  some 
sense  this  cover  dominates  all  others. 

Corollary 3.9 There is a nested minimum cover $(S^*, T^*)$ such that $T \preceq T^*$
and $S^* \preceq S$ for each minimum cover $(S, T)$.

Proof. Let $(S^*, T^*)$ be any nested minimum cover with the property that no
nested minimum cover $(S, T)$ has $T^* \prec T$ or $S \prec S^*$ (since there is a finite
bound on the length of any chain in $\mathcal{F}$, such a cover exists). Suppose that $(S, T)$
is a minimum cover with $T \not\preceq T^*$ or $S^* \not\preceq S$. By Lemma 3.2, $(S \wedge T, S \vee T)$
is a nested minimum cover. So, by Lemma 3.7, $(S \wedge T \wedge S^*, S \vee T \vee T^*)$ is a
nested minimum cover and $T^* \prec S \vee T \vee T^*$ or $S \wedge T \wedge S^* \prec S^*$, contradicting
the choice of $(S^*, T^*)$. □

We  refer  to  the  nested  minimum  cover  of  Corollary  3.9  as  the  dominant 
cover. 


4  The  Dominant  Cover 

The most common lattice polyhedra with $a \in \{0, 1, 2\}$ include bipartite matching
polyhedra and matroid intersection polyhedra. In each case, $a_\ell(S) = |\ell \cap S|$.
Here we generalize this relationship between $a$ and $\beta$. Let $\mathcal{E}$ be a (possibly
infinite) set, and let $L$ be a finite subset of $2^{\mathcal{E}}$ (generally chosen to be a collection
of pairs from $\mathcal{E}$).

We require that $\mathcal{F}$ contain $\emptyset$ and be partially ordered by set containment.
Recall that we associate with each set $S \subseteq \mathcal{E}$ the smallest member, $\sigma(S)$, of $\mathcal{F}$
containing $S$. We extend the meet and join operations of $\mathcal{F}$ to all subsets of $\mathcal{E}$ so
that for $S$ and $T \subseteq \mathcal{E}$, $S \wedge T = \sigma(S) \wedge \sigma(T)$ and $S \vee T = \sigma(S \cup T)$.

We further require that $\beta$ be normalized, non-decreasing and satisfy $\beta(\sigma(\{e\}))
= 1$ for each $e \in \mathcal{E}$, and $\beta(\sigma(\ell)) = 2$ for each $\ell \in L$. Finally, we model the
relationship between $a$ and $\beta$ via the condition $a_\ell(S) = \beta(\ell \wedge S)$ for each $\ell \in L$
and $S \in \mathcal{F}$. It is easy to see that $a_\ell$ is normalized and non-decreasing. It is
also straightforward to prove (see [25]) that $a_\ell$ is supermodular. We call the
resulting 2-Lattice polyhedra Matching 2-Lattice polyhedra.

The  following  definitions  prove  useful: 

Definition 7 A base of a subset $S \subseteq \mathcal{E}$ is a minimal subset $T \subseteq S$ with
$\sigma(T) = \sigma(S)$.

For example, if $\mathcal{F}$ is the collection of flats in a matroid, then a maximal
independent set in $S$ is a base of $S$. If $\mathcal{F}$ is the collection of linear subspaces of a
vector space, a maximal linearly independent set of vectors in $S$ is a base of $S$.

Definition 8 For $T \in \mathcal{F}$, let $L_T = \{\ell \in L : a_\ell(T) = 1\}$.

Lemma  4.1  shows  that  given  one  element  of  a  nested  minimum  cover,  we 
can  characterize  the  other. 

Lemma 4.1 If $(S, T)$ is a nested minimum cover then $S = \sigma(\{\ell \wedge T : \ell \in L_T\})$
and $T = S \vee \sigma(\{\ell \in L : a_\ell(S) = 0\})$.

Proof. Since $(S, T)$ is a nested cover, $S' = \sigma(\{\ell \wedge T : \ell \in L_T\}) \subseteq S$. Further,
since $(S', T)$ is a cover,

$$\beta(S) + \beta(T) \le \beta(S') + \beta(T).$$

It follows that $S' = S$.

Similarly, since $(S, T)$ is a nested cover, $T' = S \vee \sigma(\{\ell \in L : a_\ell(S) = 0\}) \subseteq T$
and since $(S, T')$ is a cover,

$$\beta(S) + \beta(T) \le \beta(S) + \beta(T').$$

It follows that $T' = T$. □

Now  we  characterize  the  dominant  cover  in  terms  of  maximum  Matching 
2-Lattice  vectors. 

Let $(S, T)$ be a nested cover and for each $\ell \in L_T$, let $e(\ell) \in \ell \wedge T$. Define
the matroid $M_1(S, T)$ with rank function $r_1$ on $L_T$ as follows. A set $X$ of lines
in $L_T$ is independent in $M_1(S, T)$ if $\beta(\{e(\ell) : \ell \in X\}) = |X|$.

Define the matroid $M_2(S, T; \{e\})$ with rank function $r_2$ on $L_T$ as follows. A
set $X$ of lines in $L_T$ is independent in $M_2(S, T; \{e\})$ if $\beta(X/(T \vee \{e\})) = |X|$.

Lemma 4.2 shows that if the maximum cardinality of an intersection in $M_1$
and $M_2$ is $\beta(S) - 1$, there is a cover $(S', T')$ with $T \vee e \subseteq T'$.

Lemma 4.2 If $(S, T)$ is a nested minimum cover and $e \notin T$, then the maximum
cardinality of an intersection in $M_1(S, T)$ and $M_2(S, T; \{e\})$ is either $\beta(S)$ or
$\beta(S) - 1$. Furthermore, if the maximum cardinality of an intersection in $M_1$ and
$M_2$ is $\beta(S) - 1$ then there is a minimum cover $(S', T')$ such that $T \vee e \subseteq T'$.

Proof. The maximum cardinality of an intersection in $M_1$ and $M_2$ is bounded
by $\beta(S)$. Suppose the maximum cardinality of an intersection in $M_1$ and $M_2$
is less than or equal to $\beta(S) - 1$; then there is a minimum rank cover $(X_1, X_2)$
of $L_T$ for the matroid intersection problem such that

$$r_1(X_1) + r_2(X_2) \le \beta(S) - 1,$$

that is,

$$\beta(\{e(\ell) : \ell \in X_1\}) + \beta(X_2/(T \vee e)) \le \beta(S) - 1$$

and so

$$\beta(\{e(\ell) : \ell \in X_1\}) + \beta(X_2 \vee T \vee e) \le \beta(S) + \beta(T \vee e) - 1 = \beta(S) + \beta(T).$$

Let $S' = \sigma(\{e(\ell) : \ell \in X_1\})$ and $T' = X_2 \vee T \vee e$. Then $(S', T')$ is a cover of $L$
with $T \vee e \subseteq T'$ and $\beta(S', T') \le \beta(S, T)$. Since $(S, T)$ is a minimum cover, it
follows that $(S', T')$ is a minimum cover and the size of a maximum intersection
must be at least $\beta(S) - 1$. □


Corollary 4.3 If $(S^*, T^*)$ is the dominant cover and $e \notin T^*$, then the maximum
cardinality of an intersection in $M_1(S^*, T^*)$ and $M_2(S^*, T^*; \{e\})$ is $\beta(S^*)$.

The following two lemmas identify special properties of maximum Matching
2-Lattice vectors and show conditions under which we may combine portions of
two Matching 2-Lattice vectors to form a third.

Lemma 4.4 Let $x$ be a maximum Matching 2-Lattice vector and let $(S, T)$ be
a nested minimum cover. Then $x_{L \setminus L_T}$ satisfies

1. $a(T)x_{L \setminus L_T} = \beta(T/S)$ and

2. for $T' \subseteq T$, $a(T')x_{L \setminus L_T} \le \beta(T'/S)$,

and $x_{L_T}$ satisfies

3. $a(T)x_{L_T} = \beta(S)$,

4. for $T' \subseteq T$, $a(T')x_{L_T} \le \beta(T' \wedge S)$.

Proof. First, observe that for each line $\ell \in L \setminus L_T$, $a_\ell(T) = 2$. So, if $a_\ell(S) > 0$,
$a_\ell(S) + a_\ell(T) > 2$ and, by Corollary 3.5, $x(\ell) = 0$. Thus, $a(S)x_{L \setminus L_T} = 0$. Since
$a(S)x = \beta(S)$, it follows that $a(S)x_{L_T} = \beta(S)$.

To see (3), observe that for each $\ell \in L_T$, $a_\ell(T) = a_\ell(S) = 1$. So,
$a(T)x_{L_T} = a(S)x_{L_T} = \beta(S)$.

To see (1), observe that since

$$a(T)x = \beta(T) = \beta(T \vee S) \quad\text{and}\quad a(T)x_{L_T} = \beta(S)$$

it follows that

$$a(T)x_{L \setminus L_T} = \beta(T/S).$$

To see (2), observe that for $T' \subseteq T$,

$$a(T' \vee S)x \le \beta(T' \vee S) \quad\text{and}\quad a(T' \vee S)x_{L_T} = \beta(S).$$

Thus,

$$\begin{aligned}
a(T')x_{L \setminus L_T} &\le a(T' \vee S)x_{L \setminus L_T} \\
&= a(T' \vee S)x - a(T' \vee S)x_{L_T} \\
&\le \beta(T' \vee S) - \beta(S) \\
&= \beta(T'/S).
\end{aligned}$$

To see (4), note that for $\ell \in L_T$, $a_\ell(S) = a_\ell(T)$. So,

$$a(T')x_{L_T} = a(T' \wedge S)x_{L_T} \le \beta(T' \wedge S).$$

□

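Lemma 4.5 Let $x$ and $\hat{x}$ be Matching 2-Lattice vectors and let $(S, T)$ be a nested
minimum cover. If $x$ satisfies (1) and (2) of Lemma 4.4, $\hat{x}$ satisfies (3) and (4)
of Lemma 4.4,

a. $\beta(T/\mathrm{cl}(\hat{x}_{L_T})) = \beta(T/S)$ and

b. $a(\mathrm{cl}(\hat{x}_{L_T}))x_{L \setminus L_T} = 0$,

then $x' = \hat{x}_{L_T} + x_{L \setminus L_T}$ is a Matching 2-Lattice vector.

Proof. Suppose $x'$ is not a Matching 2-Lattice vector; then there is a flat
$Z \in \mathcal{F}$ such that $a(Z)x' > \beta(Z)$. We first show that we may choose $Z$ to
contain $\mathrm{cl}(\hat{x}_{L_T})$.

By condition (b), $a(Z \wedge \mathrm{cl}(\hat{x}_{L_T}))x' = a(Z \wedge \mathrm{cl}(\hat{x}_{L_T}))\hat{x}_{L_T}$, and since $\hat{x}_{L_T}$ is
feasible, $a(Z \wedge \mathrm{cl}(\hat{x}_{L_T}))\hat{x}_{L_T} \le \beta(Z \wedge \mathrm{cl}(\hat{x}_{L_T}))$. It follows by Corollary 2.2 that

$$a(Z \vee \mathrm{cl}(\hat{x}_{L_T}))x' > \beta(Z \vee \mathrm{cl}(\hat{x}_{L_T})).$$

Thus, if $x'$ is not a Matching 2-Lattice vector, there is a flat $Z \in \mathcal{F}$ with
$\mathrm{cl}(\hat{x}_{L_T}) \subseteq Z$ such that $a(Z)x' > \beta(Z)$.

We next show that we may also assume $T \subseteq Z$.

By conditions (1) and (3)

$$a(T)x' = a(T)\hat{x}_{L_T} + a(T)x_{L \setminus L_T} = \beta(S) + \beta(T/S) = \beta(T).$$

Further, by conditions (2) and (4)

$$a(Z \wedge T)x' = a(Z \wedge T)\hat{x}_{L_T} + a(Z \wedge T)x_{L \setminus L_T} \le \beta(Z \wedge T \wedge S) + \beta((Z \wedge T)/S).$$

Therefore, $a(Z \wedge T)x' \le \beta(Z \wedge T)$, and so it follows by Corollary 2.2 that
$a(Z \vee T)x' > \beta(Z \vee T)$. But,

$$\begin{aligned}
a(Z \vee T)x' &= a(Z \vee T)\hat{x}_{L_T} + a(Z \vee T)x_{L \setminus L_T} \\
&= \beta(\mathrm{cl}(\hat{x}_{L_T})) + a(Z \vee T)x_{L \setminus L_T} && \text{since } \mathrm{cl}(\hat{x}_{L_T}) \subseteq Z \\
&= \beta(\mathrm{cl}(\hat{x}_{L_T})) + a(T)x_{L \setminus L_T} && \text{since } \mathrm{supp}(x_{L \setminus L_T}) \subseteq T \\
&= \beta(\mathrm{cl}(\hat{x}_{L_T})) + \beta(T/S) && \text{by (1)} \\
&= \beta(\mathrm{cl}(\hat{x}_{L_T})) + \beta(T/\mathrm{cl}(\hat{x}_{L_T})) && \text{by (a)} \\
&= \beta(\mathrm{cl}(\hat{x}_{L_T}) \vee T) \\
&\le \beta(Z \vee T) && \text{since } \mathrm{cl}(\hat{x}_{L_T}) \subseteq Z.
\end{aligned}$$

This contradicts the existence of $Z$ and proves that $x'$ is a Matching 2-Lattice
vector. □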


Lemma 4.6 shows that if $(S, T)$ is a nested minimum cover and $e \notin T$, then
each $\beta(S)$-intersection in $M_1(S, T)$ and $M_2(S, T; \{e\})$ gives rise to a maximum
Matching 2-Lattice vector $x$ with $e \notin \mathrm{cl}(x)$.

Lemma 4.6 Let $x$ be a maximum Matching 2-Lattice vector, $(S, T)$ be a nested
minimum cover and $e \notin T$. If $X$ is a $\beta(S)$ intersection in $M_1(S, T)$ and
$M_2(S, T; \{e\})$, then $x'$ defined by

$$x'(\ell) = \begin{cases} 1 & \text{if } \ell \in X \\ 0 & \text{if } \ell \in L_T \setminus X \\ x(\ell) & \text{otherwise} \end{cases}$$

is a maximum cardinality Matching 2-Lattice vector with $e \notin \mathrm{cl}(x')$.

Proof. First, since $X$ is independent in $M_2(S, T; \{e\})$ and $|X| = \beta(S)$,

$$\beta(X/(T \vee \{e\})) = |X| = \beta(S)$$

and so

$$\beta(X \vee T \vee \{e\}) = \beta(S) + \beta(T \vee \{e\}).$$

Second, since $X$ is independent in $M_1(S, T)$ and $|X| = \beta(S)$, $\{e(\ell) : \ell \in X\}$
is a base of $S$. Let $\{e(\ell) : \ell \in X\} \cup B$ be a base of $T$. Then $e \notin \sigma(X \cup B)$ and

$$\begin{aligned}
\beta(X/(B \cup \{e\})) &= \beta(X \cup B \cup \{e\}) - \beta(B \cup \{e\}) \\
&= \beta(S) + \beta(T \vee \{e\}) - \beta(B \cup \{e\}) \\
&= 2\beta(S) \\
&= 2|X|.
\end{aligned}$$

It follows that $x'_{L_T}$ is a Matching 2-Lattice vector.

We see that $x'_{L_T}$ satisfies (3) of Lemma 4.4 as follows. Since $a_\ell(T) = 1$ for
each $\ell \in X$,

$$a(T)x'_{L_T} = |X| = \beta(S).$$

We see that $x'_{L_T}$ satisfies (4) of Lemma 4.4 as follows. Since $a_\ell(S) = a_\ell(T) = 1$
for each $\ell \in X$, if $T' \subseteq T$,

$$a(T')x'_{L_T} = a(T' \wedge S)x'_{L_T} \le \beta(T' \wedge S).$$

Since $x$ is a maximum Matching 2-Lattice vector, $x_{L \setminus L_T}$ satisfies conditions
(1) and (2) of Lemma 4.4. Thus, to show that $x'$ is a Matching 2-Lattice
vector, we need only show that $x'_{L_T}$ and $x_{L \setminus L_T}$ satisfy conditions (a) and (b) of
Lemma 4.5.

We see that $x'_{L_T}$ satisfies (a) of Lemma 4.5 as follows. Since

$$\mathrm{cl}(x'_{L_T}) \subseteq \sigma(\mathrm{supp}(x'_{L_T})) = \sigma(X)$$

and

$$a(\sigma(X))x'_{L_T} = 2|X| = \beta(\sigma(X)),$$

it follows that $\mathrm{cl}(x'_{L_T}) = \sigma(X)$. Therefore,

$$\beta(T/\mathrm{cl}(x'_{L_T})) = \beta(T/X) = |B| = \beta(T/\sigma(\{e(\ell) : \ell \in X\})) = \beta(T/S).$$

We see that $x'_{L_T}$ and $x_{L \setminus L_T}$ satisfy condition (b) of Lemma 4.5 as follows.
Since $\mathrm{supp}(x_{L \setminus L_T}) \subseteq T$ and $\mathrm{cl}(x'_{L_T}) = \sigma(X)$,

$$a(\mathrm{cl}(x'_{L_T}))x_{L \setminus L_T} = a(X \wedge T)x_{L \setminus L_T}.$$

But $X \wedge T = S$ so

$$a(\mathrm{cl}(x'_{L_T}))x_{L \setminus L_T} = a(S)x_{L \setminus L_T} = 0.$$

Thus, by Lemma 4.5, $x'$ is a Matching 2-Lattice vector.

Since $e \notin \sigma(X \cup T)$ and $\mathrm{cl}(x') \subseteq \sigma(\mathrm{supp}(x')) \subseteq \sigma(X \cup T)$, it follows that
$e \notin \mathrm{cl}(x')$.

To see that $x'$ is a maximum cardinality Matching 2-Lattice vector, observe
that

$$\begin{aligned}
\sum_{\ell \in L} x'(\ell) &= \sum_{\ell \in L_T} x'(\ell) + \sum_{\ell \in L \setminus L_T} x(\ell) \\
&= \beta(S) + \sum_{\ell \in L} x(\ell) - \sum_{\ell \in L_T} x(\ell) \\
&= \beta(S) + \beta(S, T) - \sum_{\ell \in L_T} x(\ell) \\
&\ge \beta(S) + \beta(S, T) - \beta(S) \\
&= \beta(S, T).
\end{aligned}$$

□


Corollary 4.7 If $(S^*, T^*)$ is the dominant cover, then $T^* \supseteq \bigcap(\mathrm{cl}(x) : x \in \Omega)$.

Proof. By Corollary 4.3, if $e \notin T^*$, then the maximum cardinality of an
intersection in $M_1(S^*, T^*)$ and $M_2(S^*, T^*; \{e\})$ is $\beta(S^*)$. By Lemma 4.6,
there is $x \in \Omega$ such that $e \notin \mathrm{cl}(x)$; hence, $e \notin \bigcap(\mathrm{cl}(x) : x \in \Omega)$. Therefore,
$T^* \supseteq \bigcap(\mathrm{cl}(x) : x \in \Omega)$. □

Combining Corollary 3.6, Corollary 4.7 and Lemma 4.1, we have the following
characterization of the dominant cover in terms of maximum Matching
2-Lattice vectors.

Theorem 4.8 Let $T^* = \bigcap(\mathrm{cl}(x) : x \in \Omega)$ and $S^* = \sigma(\{\ell \wedge T^* : \ell \in L_{T^*}\})$.
Then $(S^*, T^*)$ is the dominant cover.

The following results refine Lemma 4.6 to extreme maximum Matching
2-Lattice vectors.

Lemma 4.9 Let $(S^*, T^*)$ be the dominant cover. Then, for each $x \in \Omega_{ext}$:

1. For each $\ell \in L_{T^*}$, $x(\ell) \in \{0, 1\}$,

2. $\beta(T^*/\mathrm{cl}(x_{L_{T^*}})) = \beta(T^*/S^*)$, and

3. $T^* \wedge \mathrm{cl}(x_{L_{T^*}}) = S^*$.

Proof. For each $x^* \in \Omega_{ext}$ there is a complementary dual solution $y^*$. Let
$\mathcal{S} = \{S_i : i = 1, \ldots, t\}$ be a nested family of flats in $\mathcal{F}$ and $N$ a subset of $L$ such
that $x^*$ is the unique solution to the system:

$$a(S_i)x = \beta(S_i) \text{ for each } S_i \in \mathcal{S}, \qquad x(\ell) = 0 \text{ for each } \ell \in N$$

and $y^*$ is the unique solution to the system:

$$\sum_{S_i \in \mathcal{S}} y(S_i)\,a_\ell(S_i) = 1 \text{ for each } \ell \in L \setminus N.$$

By arguments similar to those used in the proof of Theorem 3.1, there are two
indexes $i_1$ and $i_2$, $i_1 < i_2$, $i_1, i_2 \in \{0, 1, \ldots, t\}$ such that

• $S_1, \ldots, S_{i_1}$ correspond to the vertices in $G(\mathcal{S}, L \setminus N)$ that have an odd
number of edges in the unique path from $S_i$ to the root;

• $S_{i_1+1}, \ldots, S_{i_2}$ correspond to the vertices in $G(\mathcal{S}, L \setminus N)$ that have no path
from $S_i$ to the root;

• $S_{i_2+1}, \ldots, S_t$ correspond to the vertices in $G(\mathcal{S}, L \setminus N)$ that have an even
number of edges in the unique path from $S_i$ to the root; and

• $(S_{i_1}, S_{i_2})$ forms a minimum cover.

Since $S_{i_2} \subseteq T^*$, if $a_\ell(T^*) = 1$, then $a_\ell(S_{i_2}) = 1$. Clearly, if $\ell \in N$, then
$x^*(\ell) = 0$. If $\ell \notin N$ and $a_\ell(T^*) = 1$ then $\ell$ must correspond to an edge in a tree
component of $G(\mathcal{S}, L \setminus N)$. Therefore, $x^*(\ell) \in \{0, 1\}$ if $a_\ell(T^*) = 1$.

To see (2), observe that by Corollary 3.5, $a_\ell(S^*)x(\ell) = 0$ for each $\ell \in L \setminus L_{T^*}$
and $a_\ell(S^*) = 1$ for each $\ell \in L_{T^*}$. It follows that

$$a(S^*)x = \sum_{\ell \in L_{T^*}} x(\ell) = \beta(S^*).$$

Further, since $x_{L_{T^*}}$ is integral, $\mathrm{cl}(x_{L_{T^*}}) = \sigma(\mathrm{supp}(x_{L_{T^*}}))$ and so

$$a(\mathrm{cl}(x_{L_{T^*}}))x = 2\sum_{\ell \in L_{T^*}} x(\ell) = 2\beta(S^*) = \beta(\mathrm{cl}(x_{L_{T^*}})) \qquad (4.6)$$

and

$$a(T^* \vee \mathrm{cl}(x_{L_{T^*}}))x = 2\sum_{\ell \in L} x(\ell) = \beta(S^*) + \beta(T^*) = \beta(T^* \vee \mathrm{cl}(x_{L_{T^*}})). \qquad (4.7)$$

Combining (4.7) and (4.6) we see that $\beta(T^*/\mathrm{cl}(x_{L_{T^*}})) = \beta(T^*/S^*)$.

Finally, to see (3), observe that $S^* \subseteq T^* \wedge \mathrm{cl}(x_{L_{T^*}})$, but since

$$\beta(T^* \vee \mathrm{cl}(x_{L_{T^*}})) + \beta(T^* \wedge \mathrm{cl}(x_{L_{T^*}})) \le \beta(T^*) + \beta(\mathrm{cl}(x_{L_{T^*}})),$$

it follows that $\beta(T^* \wedge \mathrm{cl}(x_{L_{T^*}})) \le \beta(S^*)$. □

Corollary 4.10 Let $x$ be an extreme maximum Matching 2-Lattice vector, $(S^*,
T^*)$ be the dominant cover and $e \notin T^*$. If $X$ is a $\beta(S^*)$ intersection in
$M_1(S^*, T^*)$ and $M_2(S^*, T^*; \{e\})$, then $x'$ defined by

$$x'(\ell) = \begin{cases} 1 & \text{if } \ell \in X \\ 0 & \text{if } \ell \in L_{T^*} \setminus X \\ x(\ell) & \text{otherwise} \end{cases}$$

is an extreme maximum cardinality Matching 2-Lattice vector with $e \notin \mathrm{cl}(x')$.

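Proof. In Lemma 4.6, we showed that $x' \in \Omega$. If $x'$ is not extreme, there is a
subset $\{x^1, x^2, \ldots, x^k\}$ of distinct vectors in $\Omega_{ext}$ such that

$$x' = \lambda_1 x^1 + \lambda_2 x^2 + \cdots + \lambda_k x^k$$

for some $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_k) > 0$ with $\sum_i \lambda_i = 1$. We show that $\bar{x}^i = x_{L_{T^*}} +
x^i_{L \setminus L_{T^*}}$ is in $\Omega$ for each $i \in \{1, \ldots, k\}$ as follows.

Since $x \in \Omega$, $x_{L_{T^*}}$ satisfies conditions (3) and (4) of Lemma 4.4. Similarly,
since $x^i \in \Omega$, $x^i_{L \setminus L_{T^*}}$ satisfies conditions (1) and (2) of Lemma 4.4 for $i =
1, 2, \ldots, k$.

By (2) of Lemma 4.9, $x_{L_{T^*}}$ satisfies (a) of Lemma 4.5. And, since $\mathrm{supp}(x^i_{L \setminus L_{T^*}})
\subseteq T^*$,

$$a(\mathrm{cl}(x_{L_{T^*}}))x^i_{L \setminus L_{T^*}} = a(\mathrm{cl}(x_{L_{T^*}}) \wedge T^*)x^i_{L \setminus L_{T^*}}.$$

But $\mathrm{cl}(x_{L_{T^*}}) \wedge T^* = S^*$ so

$$a(\mathrm{cl}(x_{L_{T^*}}))x^i_{L \setminus L_{T^*}} = a(S^*)x^i_{L \setminus L_{T^*}} = 0;$$

proving that $x_{L_{T^*}}$ and $x^i_{L \setminus L_{T^*}}$ satisfy condition (b) of Lemma 4.5.

Thus, by Lemma 4.5, $\bar{x}^i$ is a Matching 2-Lattice vector for each $i \in \{1, \ldots, k\}$.
Since

$$x_{L \setminus L_{T^*}} = x'_{L \setminus L_{T^*}} = \lambda_1 x^1_{L \setminus L_{T^*}} + \lambda_2 x^2_{L \setminus L_{T^*}} + \cdots + \lambda_k x^k_{L \setminus L_{T^*}}$$

it follows that

$$x = \lambda_1 \bar{x}^1 + \lambda_2 \bar{x}^2 + \cdots + \lambda_k \bar{x}^k.$$

Further, since $x'(\ell) \in \{0, 1\}$ for $\ell \in L_{T^*}$, $x^i_{L_{T^*}} = x'_{L_{T^*}}$ for $i = 1, \ldots, k$. Hence, the members
of $\{x^i_{L \setminus L_{T^*}} : i \in \{1, \ldots, k\}\}$ are distinct and therefore so are the members of
$\{\bar{x}^i : i \in \{1, \ldots, k\}\}$. This contradicts the assumption that $x$ is extreme. □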


Corollary 4.11 Let $(S^*, T^*)$ be the dominant cover; then

$$T^* = \bigcap(\mathrm{cl}(x) : x \in \Omega) = \bigcap(\mathrm{cl}(x) : x \in \Omega_{ext}).$$

In the case of matroid intersection, we have the following characterization.

Corollary 4.12 Let $M_1$ be a matroid with rank function $r_1$ and closure operator
$\sigma_1$ and let $M_2$ be a matroid with rank function $r_2$ and closure operator $\sigma_2$, both
defined on the same ground set $E$, and let $\Omega_{ext}$ be the collection of all maximum
cardinality intersections in $M_1$ and $M_2$. Then for each $I \in \Omega_{ext}$,

$$|I| = r_1(T_1) + r_2(E \setminus T_1) = r_1(E \setminus T_2) + r_2(T_2),$$

where

$$T_1 = \bigcap(\sigma_1(I) : I \in \Omega_{ext}) \quad\text{and}\quad T_2 = \bigcap(\sigma_2(I) : I \in \Omega_{ext}).$$
Abstract


We introduce the framework for a primal dual integer programming algorithm. We
prove convergence, and discuss some special cases.

1 Introduction


Many optimization algorithms are based on the relationships derived from linear
programming duality theory. While Chvátal [5], Blair and Jeroslow [2], Johnson [16] and Wolsey
[19] have developed a rich duality theory for integer programming, this theory has not yet
been exploited algorithmically. We propose a method which uses Chvátal functions to form
a generic primal dual algorithm for general integer programs. When certain subproblems
can be solved efficiently, this procedure will solve 0-1 integer programs in time that is
pseudopolynomial in the size of the problem. Although, except in special cases, there will not
be a polynomial time algorithm to solve the subproblems, they can always be solved by
generating cutting planes.

In this section we present some background material. The second section contains a
description of the algorithm, and a proof of convergence. In the third section, we discuss
some interesting special cases and show how the algorithm generalizes Gomory's familiar
cutting plane algorithm. Let $Q$ denote the rational numbers, and let $Z$ denote the integers.
$Q_+$ and $Z_+$ will denote the nonnegative rationals and integers respectively. Further, $\lfloor x \rfloor$ will
denote the greatest integer less than or equal to $x$. If $f$ is a function, then the function $\lfloor f \rfloor$
is defined by $\lfloor f \rfloor(x) = \lfloor f(x) \rfloor$.

1.1 The Superadditive Dual of an Integer Programming Problem

Let $S_n$ denote the set of $n$-dimensional superadditive functions. (A function $f$ is superadditive
if $f(a + b) \ge f(a) + f(b)$ for all $a$ and $b$.) A duality involving superadditive functions holds
for integer programming problems (see e.g. [19]). We briefly outline this duality here.
Consider the integer programming problem

(P) max $cx$

s.t. $Ax = b$

$x \ge 0$ and integer,

where $A$ is an integral $m \times n$ matrix, $c$ is an integral $n$-vector and $b$ is an integral $m$-vector.
If $a_j$ denotes the $j$th column of $A$, and $c_j$ is the $j$th component of $c$, then the superadditive
dual of (P) is:

(D) min $f(b)$

s.t. $f(a_j) \ge c_j$, for $j = 1, \ldots, n$

$f \in S_m$.

The following weak and strong duality properties hold for (P) and (D).

Weak Duality Property: Let $x$ and $f$ be feasible solutions for (P) and (D) respectively.
Then $cx \le f(b)$.

Strong Duality Property: Let $x^*$ be an optimal solution to (P). Then (D) has an optimal
solution $f^*$ and $cx^* = f^*(b)$.

The Weak Duality Property is easily verified using superadditivity. The Strong Duality
Property follows from the fact that the value function of (P) is superadditive.
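
For concreteness, the weak duality argument can be written out as a short chain. This is a sketch under one added assumption that the text does not state explicitly: the usual normalization $f(0) = 0$, which is needed for components with $x_j = 0$.

```latex
\begin{aligned}
cx &= \sum_{j=1}^{n} c_j x_j \\
   &\le \sum_{j=1}^{n} f(a_j)\, x_j
      && \text{since } f(a_j) \ge c_j \text{ and } x_j \ge 0 \\
   &\le \sum_{j=1}^{n} f(x_j a_j)
      && \text{superadditivity applied } x_j \text{ times (and } f(0) = 0\text{)} \\
   &\le f\Big(\sum_{j=1}^{n} x_j a_j\Big)
      && \text{superadditivity across columns} \\
   &= f(Ax) = f(b).
\end{aligned}
```

The second step uses integrality of $x$: for a positive integer $x_j$, repeated use of $f(u + v) \ge f(u) + f(v)$ gives $f(x_j a_j) \ge x_j f(a_j)$.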

Blair and Jeroslow ([2]) have shown that the duality still holds even when the class of
functions is restricted to be Chvátal functions. The class of Chvátal functions can be defined
as follows. Let $L_n$ denote the set of $n$-dimensional linear functions with rational coefficients.

Definition: The class $C_n$ of $n$-dimensional Chvátal functions is the smallest class $K$
satisfying the following properties:

1. If $f \in L_n$ then $f \in K$;

2. If $f, g \in K$ and $\alpha, \beta \in Q_+$, then $\alpha f + \beta g \in K$;

3. If $f \in K$, then $\lfloor f \rfloor \in K$.

Note that a Chvátal function is superadditive, and that the class of Chvátal functions
contains the linear functions. The Weak Duality Property still holds since the Chvátal
functions are superadditive. That the Strong Duality Property still holds for this restricted
class is nontrivial, and the interested reader is referred to [2].
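
The three closure rules in the definition can be mirrored directly in code. The Python sketch below is only an illustration: the constructor names `linear`, `combine` and `round_down`, and the particular function `h`, are assumptions made here and do not come from the paper. It builds a two-dimensional Chvátal function with nested round-downs and spot-checks superadditivity on a small integer grid.

```python
from fractions import Fraction
from itertools import product
from math import floor

def linear(coeffs):
    """Rule 1: a linear function with rational coefficients."""
    cs = [Fraction(c) for c in coeffs]
    return lambda d: sum(c * di for c, di in zip(cs, d))

def combine(alpha, f, beta, g):
    """Rule 2: a nonnegative rational combination alpha*f + beta*g."""
    a, b = Fraction(alpha), Fraction(beta)
    assert a >= 0 and b >= 0
    return lambda d: a * f(d) + b * g(d)

def round_down(f):
    """Rule 3: the round-down of f, evaluated pointwise."""
    return lambda d: Fraction(floor(f(d)))

# h(d) = floor( (1/2) d_1 + (1/2) floor( (3/2) d_2 ) )
inner = round_down(linear([0, Fraction(3, 2)]))
h = round_down(combine(Fraction(1, 2), linear([1, 0]), Fraction(1, 2), inner))

def is_superadditive(f, values, dim=2):
    """Check f(u + v) >= f(u) + f(v) for all u, v with entries in `values`."""
    points = list(product(values, repeat=dim))
    return all(
        f(tuple(ui + vi for ui, vi in zip(u, v))) >= f(u) + f(v)
        for u in points for v in points
    )
```

For instance, `h((1, 1))` evaluates to 1, and `is_superadditive(h, range(-3, 4))` holds on that grid. A finite grid check is of course not a proof; it only illustrates the property stated in the text.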

1.2  A  Separation  Theorem  and  Optimality  Conditions 

As a consequence of strong duality the following proposition holds. (See [2], or derive
Proposition 1.2.1 by applying the Strong Duality Property to an integer programming problem
with artificial variables and a phase 1 objective.)

Proposition 1.2.1 [2] Let $A$ be an $m \times n$ matrix with integer entries, and let $b$ be an integral
$m$-vector. Then exactly one of the following alternatives holds:

1. There exists $x \in Z^n_+$ with $Ax = b$;

2. There exists $f \in C_m$ with $f(a_j) \ge 0$ for all $j = 1, \ldots, n$ and $f(b) < 0$.

Blair and Jeroslow ([3]) show that the separating function of alternative 2 may have an
exponential nesting of round-downs, so that Proposition 1.2.1 does not give an NP $\cap$ co-NP
characterization of the integer programming feasibility problem.

We  conclude  this  section  by  presenting  the  optimality  conditions  for  (P)  and  (D)  (given 
here  as  Proposition  1.2.2).  Similar  conditions  involving  general  superadditive  functions  are 
given  in  [16]  and  [19].  We  omit  the  straightforward  proof  of  Proposition  1.2.2. 

Proposition 1.2.2 Let $x$ and $f$ be feasible solutions to the problems (P) and (D) given above.
Then $x$ and $f$ are optimal solutions if and only if the following two conditions hold:

1. (Complementary Slackness) For all $j = 1, \ldots, n$, if $f(a_j) > c_j$ then $x_j = 0$;

2. (Complementary Linearity) $\sum_{j=1}^{n} f(a_j)x_j = f(b)$.

Note that Condition 1 is analogous to the usual complementary slackness conditions of
linear programming. In the event that $f$ is a linear function, Condition 2 holds trivially.
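
The two optimality conditions are easy to test numerically once the values of the dual function on the columns and on the right-hand side are known. The following Python sketch is an illustration under assumptions made here: the function name `satisfies_optimality_conditions`, the one-row instance, and the dual function `f` (the round-down of half the argument) are all hypothetical.

```python
def satisfies_optimality_conditions(columns, c, b, x, f):
    """Check primal/dual feasibility plus the two conditions of Proposition 1.2.2.

    columns: list of columns a_j (tuples), c: objective, b: right-hand side (tuple),
    x: candidate primal solution, f: callable dual function on m-vectors.
    """
    m = len(b)
    ax = tuple(sum(col[i] * xj for col, xj in zip(columns, x)) for i in range(m))
    primal_feasible = ax == tuple(b) and all(
        isinstance(xj, int) and xj >= 0 for xj in x)
    dual_feasible = all(f(col) >= cj for col, cj in zip(columns, c))
    # Complementary slackness: f(a_j) > c_j forces x_j = 0.
    slackness = all(xj == 0 for col, cj, xj in zip(columns, c, x) if f(col) > cj)
    # Complementary linearity: sum_j f(a_j) x_j = f(b).
    linearity = sum(f(col) * xj for col, xj in zip(columns, x)) == f(tuple(b))
    return primal_feasible and dual_feasible and slackness and linearity

# Hypothetical instance: max x_1  s.t.  2 x_1 + x_2 = 3,  x >= 0 integer.
columns = [(2,), (1,)]
c = [1, 0]
b = (3,)
f = lambda d: d[0] // 2          # a Chvatal function: floor(d / 2)

optimal_pair = satisfies_optimality_conditions(columns, c, b, [1, 1], f)
non_optimal_pair = satisfies_optimality_conditions(columns, c, b, [0, 3], f)
```

With x = (1, 1) both conditions hold and cx = f(b) = 1. With the feasible but non-optimal x = (0, 3), complementary slackness holds vacuously while complementary linearity fails, which shows why the second condition is needed when f is not linear.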

2  The  Algorithm 

In this section we introduce our primal dual algorithm using the duals introduced in the
last section. In a few places the actual details will be left as "black
boxes." The main reason is that this allows us to present the algorithm
as a generic framework into which many different implementations may be built. We will
prove what these black boxes must do in order to ensure finite convergence of
the algorithm, and then in the next section we discuss a few implementations that satisfy
these requirements.

2.1 The Basic Steps

As  mentioned  in  the  last  section,  we  always  will  assume  that  all  data  is  integral. 

Primal  Dual  Algorithm 

Input: Integral $m \times n$ matrix $A$, integral $n$-vector $c$ and integral nonnegative $m$-vector $b$.
Output: Integral nonnegative $n$-vector $x$ optimizing integer programming problem (P),
or information that (P) is infeasible or unbounded; and a Chvátal function $f$ optimizing the
dual problem (D), or information that (D) is infeasible or unbounded.

Step 1: Let $f$ be a dual feasible function. Without loss of generality, assume that $f = \lfloor f \rfloor$.
Let $J = \{j \in \{1, \ldots, n\} \mid f(a_j) = c_j\}$. [If no such $f$ exists then (D) is infeasible, and
(P) is infeasible or unbounded. STOP.]

Step 2: Consider the integer program that looks for an integral solution to (P) using only
coordinates in the set $J$. Call this problem (RFP) [restricted feasibility problem]:

(RFP) max $W = \sum_{i=1}^{m} -x^a_i$

s.t. $\sum_{j \in J} a_{ij} x_j + x^a_i = b_i$ for $i = 1, \ldots, m$

all $x \ge 0$ and integer.

There are three possibilities:

• $W^* < 0$. Go to Step 3.

• $W^* = 0$ and $\sum_{j \in J} f(a_j) x^*_j = f(b)$. Go to Step 6.

• $W^* = 0$ and $\sum_{j \in J} f(a_j) x^*_j < f(b)$. Go to Step 7.
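
To make Steps 1 and 2 concrete, the Python sketch below computes the tight set J from stored values of the dual function on the columns and solves (RFP) by exhaustive enumeration. It is a toy illustration, not the method the paper has in mind for the black box: the names `tight_columns` and `solve_rfp` are hypothetical, and the enumeration bound assumes that A has nonnegative entries, so that no useful variable can exceed the largest entry of b.

```python
from itertools import product

def tight_columns(f_cols, c):
    """Step 1: indices j with f(a_j) = c_j, given the stored values f(a_j)."""
    return [j for j, (fj, cj) in enumerate(zip(f_cols, c)) if fj == cj]

def solve_rfp(A, b, J):
    """Step 2 by brute force: maximise W = -(sum of artificial variables).

    Assumes A >= 0 entrywise and b >= 0, so x_j in 0..max(b) suffices.
    Returns (W*, {j: x_j}) for one optimal solution.
    """
    m = len(b)
    bound = max(b, default=0)
    best_w, best_x = None, None
    for values in product(range(bound + 1), repeat=len(J)):
        # Artificial variable x^a_i = b_i - sum_{j in J} a_ij x_j must stay >= 0.
        artificial = [b[i] - sum(A[i][j] * v for j, v in zip(J, values))
                      for i in range(m)]
        if any(a < 0 for a in artificial):
            continue
        w = -sum(artificial)
        if best_w is None or w > best_w:
            best_w, best_x = w, dict(zip(J, values))
    return best_w, best_x

# Hypothetical instance: one constraint 2 x_1 + x_2 = 3 with c = (1, 0).
A = [[2, 1]]
b = [3]
c = [1, 0]

J_small = tight_columns([1, 1], c)   # only column 0 is tight
J_full = tight_columns([1, 0], c)    # both columns are tight
w_small, x_small = solve_rfp(A, b, J_small)
w_full, x_full = solve_rfp(A, b, J_full)
```

With only column 0 available the best value is W* = -1, so the algorithm would move to Step 3. With both columns available W* = 0, and the branch taken (Step 6 or Step 7) depends on which optimal solution of (RFP) is returned.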

Step 3: From Proposition 1.2.1 it is clear that there exists a Chvátal function $\bar{f}$ such that:

$\bar{f}(a_j) \ge 0 \quad \forall j \in J$,

$\bar{f}(b) \le -1$.

There are two possibilities:

• $\exists$ some $j \notin J$ with $\bar{f}(a_j) < 0$. Go to Step 4.

• $\bar{f}(a_j) \ge 0 \ \forall j$. Go to Step 5.

Step 4: Let

$$\theta = \min_{j \notin J,\ \bar{f}(a_j) < 0} \left( \frac{c_j - f(a_j)}{\bar{f}(a_j)} \right).$$

Then set $\hat{f} = \lfloor f + \theta \bar{f} \rfloor$. Replace $f$ by $\hat{f}$ and update $J$. Go to Step 2.
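
The Step 4 update only needs the stored values of the current dual function and of the separating function from Step 3 on the columns and on b. The Python sketch below is a minimal illustration under that reading; the name `step4_update`, the argument layout, and the sample numbers are assumptions introduced here. It takes the step length as the smallest ratio (c_j - f(a_j)) / fbar(a_j) over columns where the separating function is negative, then rounds down.

```python
from fractions import Fraction
from math import floor

def step4_update(f_cols, fbar_cols, c, f_b, fbar_b):
    """Return (theta, new f(a_j) values, new f(b)) for the Step 4 update.

    f_cols / fbar_cols hold the values of the current dual function and of the
    separating function on the columns a_j; f_b / fbar_b are their values at b.
    """
    ratios = [Fraction(cj - fj) / Fraction(gj)
              for fj, gj, cj in zip(f_cols, fbar_cols, c) if gj < 0]
    theta = min(ratios)   # Step 4 is only reached when some value is negative
    new_cols = [floor(fj + theta * gj) for fj, gj in zip(f_cols, fbar_cols)]
    new_b = floor(f_b + theta * fbar_b)
    return theta, new_cols, new_b

# Hypothetical stored values for three columns.
c = [1, 0, 2]
f_cols = [1, 3, 4]        # dual feasible: f(a_j) >= c_j, tight only for j = 0
fbar_cols = [0, -1, -4]   # nonnegative on the tight set, negative elsewhere
theta, new_cols, new_b = step4_update(f_cols, fbar_cols, c, f_b=5, fbar_b=-1)
```

In this example theta = 1/2, the updated column values are (1, 2, 2), which remain at least c, and the dual objective drops from 5 to 4. Column 2 becomes tight, so J grows before the return to Step 2.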

Step  5:  Here  it  is  clear  that  (D)  is  unbounded  and  hence  (P)  is  infeasible.  STOP. 


Step  6:  Here  x*  and  /  satisfy  the  optimality  conditions  and  so  are  optimal.  STOP. 

Step 7: Now consider the following problem that attempts to push the objective up to $f(b)$
while maintaining integrality. This problem is called (RMP) [restricted maximization
problem]:

(RMP) max $V = \sum_{j \in J} f(a_j) x_j$

s.t. $\sum_{j \in J} a_{ij} x_j = b_i$ for $i = 1, \ldots, m$

$\sum_{j \in J} f(a_j) x_j \le f(b)$

all $x \ge 0$ and integer

There are two possibilities:

• $V^* = f(b)$ and all $x^*_j$ for $j \in J$ are integral, where $x^*$ is the optimal solution of
(RMP). Go to Step 6.

• Otherwise (there is no integer solution with value $f(b)$) go to Step 8.

Step 8:  It is known by the strong duality property that there exists a Chvatal function $\bar f$ such that

$$\bar f(a_j) \ge f(a_j) \quad \forall j \in J, \qquad \bar f(b) < f(b).$$

There are two possibilities:

•  If $\bar f(a_j) \ge c_j \ \forall j$ then $\bar f$ is dual feasible.  Replace $f$ by $\lfloor \bar f \rfloor$ and update $J$.  Go to Step 2.

•  Otherwise, go to Step 9.


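Step 9:  Let

$$\lambda = \max_{j:\ \bar f(a_j) < c_j} \frac{c_j - \bar f(a_j)}{f(a_j) - \bar f(a_j)}$$

and set $\hat f = \lfloor \lambda f + (1 - \lambda) \bar f \rfloor$.  Then $\hat f$ is a dual feasible function.  Replace $f$ by $\hat f$ and update $J$.  Go to Step 2.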


We conclude this subsection by noting that the algorithm only requires storing the values of $f(a_j)$, for $1 \le j \le n$, and $f(b)$, which can be updated as the algorithm progresses.  It is not necessary to store a representation of the entire function $f$.
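
As an illustration of this storage scheme, the following Python sketch applies the Step 4 and Step 9 updates directly to tables of stored values. The function names `step4_update` and `step9_update`, the list layout, and the numbers in the usage note are assumptions made for this sketch; the values of $\bar f$ at each $a_j$ and at $b$ are taken as given (in the algorithm they would come from Step 3 or Step 8).

```python
from fractions import Fraction
from math import floor


def step4_update(f_a, f_b, fbar_a, fbar_b, c):
    """Step 4 on stored values: f_hat = floor(f + theta * fbar).

    f_a[j], fbar_a[j] hold f(a_j), fbar(a_j); f_b, fbar_b hold f(b), fbar(b).
    Assumes f is dual feasible (f_a[j] >= c[j]) and some fbar_a[j] < 0.
    """
    theta = min(
        Fraction(c[j] - f_a[j]) / Fraction(fbar_a[j])
        for j in range(len(c))
        if fbar_a[j] < 0
    )
    new_a = [floor(f_a[j] + theta * fbar_a[j]) for j in range(len(c))]
    new_b = floor(f_b + theta * fbar_b)
    return new_a, new_b, theta


def step9_update(f_a, f_b, fbar_a, fbar_b, c):
    """Step 9 on stored values: f_hat = floor(lam * f + (1 - lam) * fbar).

    Assumes f is dual feasible and some fbar_a[j] < c[j].
    """
    lam = max(
        Fraction(c[j] - fbar_a[j]) / Fraction(f_a[j] - fbar_a[j])
        for j in range(len(c))
        if fbar_a[j] < c[j]
    )
    new_a = [floor(lam * f_a[j] + (1 - lam) * fbar_a[j]) for j in range(len(c))]
    new_b = floor(lam * f_b + (1 - lam) * fbar_b)
    return new_a, new_b, lam
```

For instance, with the hypothetical data `c = [3, 1, 2]`, `f_a = [3, 4, 5]`, `f_b = 10`, calling `step4_update` with `fbar_a = [1, -2, -1]`, `fbar_b = -1` keeps every stored value at or above its `c[j]` and lowers the stored value at `b` by at least 1, which is the behaviour the convergence argument relies on.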


2.2  Correctness  and  Finite  Convergence 

Now  let  us  look  in  detail  at  some  of  the  steps  of  the  algorithm  defined  in  Section  2.1.  In 
Step  1,  it  is  not  difficult  to  get  an  initial  dual  feasible  function;  in  particular,  one  can  solve 
the  linear  programming  relaxation  of  (P)  and  use  the  (linear)  dual  function.  Since  we  have 
assumed  that  all  data  is  integral,  it  is  clear  that  if  the  linear  programming  dual  is  infeasible 
then (D) is also infeasible.  Denote by $f_0$ the function used by Step 1.  In Step 2, it is clear that (RFP) is always feasible (set $x_i^a = b_i$ for all $i$ and $x_j = 0$ for all $j \in J$).  Further, as all the $x_i^a$ are nonnegative, it is clear that the objective will always be nonpositive.

Consider $\theta$ arising in Step 4.  By construction it is clear that the denominator is less than zero.  Further, by dual feasibility of $f$ and the definition of $J$, the numerator is also negative, hence $\theta$ will be positive.  Now, since a positive linear combination of Chvatal functions is a Chvatal function, $\hat f$ will be a Chvatal function.  Hence, by the construction in Step 3, $\hat f$ will be dual feasible so it is justified to go to Step 2 with $\hat f$.  Note that by Step 3, $\bar f(b) < 0$ and hence $\hat f(b) < f(b)$.  Further, by definition, $\hat f$ is integral so $\hat f(b) \le f(b) - 1$.
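
The dual feasibility of the updated function can be spelled out. The derivation below is a sketch under the reading that Step 4 sets $\theta$ to the minimum of $(c_j - f(a_j)) / \bar f(a_j)$ over the indices with $\bar f(a_j) < 0$ and then takes $\hat f = \lfloor f + \theta \bar f \rfloor$; it uses only the dual feasibility of $f$ and the integrality of $c$.

```latex
% For every j with \bar f(a_j) < 0 (dividing by a negative number flips the inequality):
f(a_j) + \theta \bar f(a_j) \ge c_j
  \iff \theta \le \frac{c_j - f(a_j)}{\bar f(a_j)},
% which holds because \theta is the minimum of these ratios.
% For every j with \bar f(a_j) \ge 0, using \theta > 0 and f(a_j) \ge c_j:
f(a_j) + \theta \bar f(a_j) \ge f(a_j) \ge c_j .
% Since each c_j is an integer, rounding down preserves the inequality:
\hat f(a_j) = \lfloor f(a_j) + \theta \bar f(a_j) \rfloor \ge c_j \quad \forall j .
```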

The algorithm reaches Step 5 when the $\bar f$ found in Step 3 satisfies $\bar f(a_j) \ge 0 \ \forall j$.  Consider the function $\hat f = \lfloor f + \epsilon \bar f \rfloor$.  Clearly, this function will be dual feasible for any nonnegative $\epsilon$.  Further, as $\epsilon$ approaches infinity, $\hat f(b)$ approaches negative infinity.  Hence the statement made in Step 5 is correct, i.e., (D) is unbounded.  By Weak Duality it is clear then that (P) is infeasible.

Step 6 can be reached in two ways.  In either case (from Step 2 or Step 7), we have an $x^*$ that satisfies (P) for which $x_j^* > 0 \Rightarrow f(a_j) = c_j$, so complementary slackness holds (Condition 1 of Proposition 1.2.2).  Further, we know that $\sum_{j \in J} c_j x_j^* = \sum_{j \in J} f(a_j) x_j^* = f(b)$.  Hence complementary linearity holds (Condition 2 of Proposition 1.2.2).  Thus by Proposition 1.2.2, $x^*$ and $f$ are optimal.

In Step 8, clearly if $\bar f(a_j) \ge c_j$ for all $j$ then $\bar f$ is dual feasible so it is acceptable to return to Step 2.  Further, note that by construction $\bar f(b) < f(b)$ and since $\lfloor \bar f \rfloor$ is integral, $\lfloor \bar f(b) \rfloor \le f(b) - 1$.

Consider $\lambda$ found in Step 9.  Clearly by construction the numerator is positive.  Further, since $f$ is dual feasible we know that $f(a_j) \ge c_j \ \forall j$ and since we also know by construction that within the definition of $\lambda$, $\bar f(a_j) < c_j$, it is clear that $0 < \lambda < 1$.  Thus, since a convex combination of Chvatal functions is a Chvatal function, $\hat f$ is a Chvatal function.  It is not difficult to see that $\hat f$ is dual feasible and hence it is acceptable to return to Step 2 with $\hat f$.  Further, since both $\bar f(b)$ and (hence) $\hat f(b)$ are less than $f(b)$, and since $\hat f$ is integral, it is clear that $\hat f(b) \le f(b) - 1$.

Now  consider  the  flow  of  the  algorithm.  Notice  that  each  time  the  algorithm  leaves  Step 
2,  it  either  ends  up  in  Step  4,  5,  6,  8  or  9.  Steps  5  and  6  are  terminal  steps  as  shown 
above.  Further  we  have  shown  that  when  the  algorithm  leaves  Steps  4,  8  or  9  to  return  to 
Step  2,  it  does  so  with  an  integral  dual  feasible  function  which  forces  the  objective  value 
to decrease by at least 1.  Hence, if $f^*$ is the optimal dual function, then in no more than $f_0(b) - f^*(b)$ steps the algorithm must terminate.  If $f_0$ corresponds to the dual solution of the linear programming relaxation of (P), then $f_0(b) = \lfloor c x^* \rfloor$, where $x^*$ is the optimal linear programming solution.  If (P) is a 0-1 integer programming problem, $f_0(b) - f^*(b) \le \sum_j |c_j|$.  Thus if the subproblems can be solved in polynomial (or pseudopolynomial) time, a pseudopolynomial time algorithm results.

We should note here that in Step 7, it is not necessary to solve (RMP) to optimality.  It is merely necessary to find a dual solution of value less than $f(b)$ (which proves that the value of (RMP) is less than $f(b)$).  (Note that the dual of any linear programming solution of (RMP) will have value at most $f(b)$.)  This fact results in computational savings when Step 7 is implemented using cutting planes, as discussed in Section 3.2.


3  The  Black  Boxes


The  major  problems  with  implementing  the  primal  dual  algorithm  of  Section  2  above  are  in 
Steps  2,  3,  7  and  8.  In  particular,  note  that  (RFP)  is  an  integer  programming  problem  that 
could  be  as  difficult  as  the  original  problem  (P).  Here,  we  discuss  two  approaches  to  dealing 
with  solving  (RFP)  and  implementing  the  black  boxes  of  Steps  3  and  8  of  the  algorithm. 

In  Section  3.1,  we  look  at  two  special  cases  of  (P)  where  the  structure  allows  for  an 
easy  solution  of  (RFP).  Then,  in  Section  3.2,  we  consider  cutting  planes.  Specifically  we 
specialize  the  results  of  Chvatal  ([5])  to  build  dual  superadditive  functions  using  Gomory 
cutting  planes  in  order  to  implement  Steps  2,  3,  7,  and  8  of  the  algorithm.  Cutting  planes 
can  always  be  used  to  solve  (RFP)  and  (RMP)  if  no  special  purpose  methods  are  available. 

Finally,  in  Section  3.3  we  show  how  our  framework  can  be  used  to  find  a  maximum  weight 
matching  in  a  graph.  In  this  case  the  cardinality  matching  algorithm  of  Edmonds  ([7])  is 
used  to  solve  (RFP). 

3.1  Easy  Special  Cases 

Here  we  briefly  discuss  two  very  special  cases  where  implementation  of  the  algorithm  is  easy. 
For any $J \subseteq \{1, \ldots, n\}$ define (PJ) to be the subproblem of (P) using only the columns of $A$ in the set $J$.

First consider the case when (PJ) is feasible if and only if its linear programming relaxation is feasible for each $J$.  In this case, $\{x \in Q^n \mid Ax = b,\ x \ge 0\}$ is an integral polyhedron, and we will show that our algorithm reduces to the usual primal dual algorithm for linear programming.  Since (PJ) is feasible if and only if its linear programming relaxation is feasible, $W^* < 0$ if and only if the value of the linear programming relaxation of (RFP) is less than 0.  If so, then there is a linear function $\bar f$ (the optimal dual solution to the linear programming relaxation of (RFP)) that will satisfy the conditions of Step 3 of the algorithm.  In general, the rounding of the function that occurs in Step 4 of the algorithm is vital to the convergence of the algorithm.  However in this case, because the linear dual solution can be used, convergence is guaranteed without rounding.  As with the usual primal dual algorithm for linear programming, we can assume all optimal solutions to (RFP) are basic solutions.  Since no basis is ever repeated, the algorithm is finite.  Thus, with the rounding step omitted, the dual function will remain linear throughout the execution of the algorithm.  As a consequence, complementary linearity will always be satisfied trivially, (RMP) need never be solved, and our algorithm reduces to the primal dual algorithm for linear programming.

Next,  define  the  group  relaxation  of  (P),  called  here  (G): 

(G)  max  cx 

s.t.  Ax  =  b 

x  integer. 

As  our  second  easy  special  setting,  we  now  consider  the  case  when  for  any  column  set,  J, 
the  problem  (P  J)  is  feasible  if  and  only  if  both  its  linear  relaxation  and  group  relaxation 
are  feasible.  (A  simple  example  where  this  condition  holds  is  if  A  is  a  diagonal  matrix  with 
nonnegative  integers  along  its  diagonal.)  Now,  to  solve  (RFP)  one  must  first  check  if  the 
linear  relaxation  is  feasible.  If  not,  then  as  above,  the  linear  programming  dual  solution  will 
work  in  Step  3  of  the  algorithm  directly.  If  the  linear  relaxation  is  feasible,  then  next  check 
the  feasibility  of  the  group  relaxation.  If  the  group  relaxation  is  not  feasible,  by  the  theorem 
of  the  alternative  for  integral  systems  of  equations  (see  [8]  for  example),  it  is  known  that 

There exists $y$ such that $y a_j \in Z \ \forall j \in J$, but $yb \notin Z$.

Moreover, $y$ can be found in polynomial time by a unimodular elimination scheme (see [6]).  Then, let $\bar f(w) = \lfloor yw \rfloor - yw$.  This function satisfies the property required by Step 3 of
the  algorithm.  If  both  the  linear  and  group  relaxations  are  feasible,  then  there  is  a  feasible 
solution  to  (RFP)  with  value  0,  and  this  solution  will  be  optimal  for  (P).  In  particular,  there 
will  be  an  optimal  solution  to  the  linear  programming  relaxation  of  (RFP)  that  is  integral. 
This  integral  solution  can  be  found  by  pivoting  among  the  alternate  optima  to  the  linear 
programming  relaxation  or  by  using  other  standard  techniques.  Note  that  such  an  integral 
solution  need  only  be  sought  in  the  final  iteration  of  the  algorithm. 
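
To make the group-relaxation certificate concrete, here is a small Python sketch that evaluates $\lfloor yw \rfloor - yw$ on a diagonal instance of the kind mentioned above. The instance ($A = \mathrm{diag}(2, 3)$, $b = (3, 3)$), the multiplier `y`, and the helper name `certificate` are all assumptions chosen for illustration; `y` is written down by hand here rather than computed by an elimination scheme.

```python
from fractions import Fraction
from math import floor


def certificate(y, w):
    """Evaluate floor(y.w) - y.w for a rational row vector y and a vector w."""
    yw = sum(Fraction(yi) * wi for yi, wi in zip(y, w))
    return floor(yw) - yw


# Hypothetical instance: A = diag(2, 3), b = (3, 3).
# The linear relaxation is feasible (x = (3/2, 1)), but 2 * x_1 = 3
# has no integer solution, so the group relaxation is infeasible.
columns = [(2, 0), (0, 3)]
b = (3, 3)

# y.a_j is an integer for both columns, while y.b = 3/2 is not.
y = (Fraction(1, 2), 0)

values_on_columns = [certificate(y, a) for a in columns]
value_on_b = certificate(y, b)
```

On this instance the function is 0 on every column and strictly negative at `b`, which is the sign pattern Step 3 asks for.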

For any integer programming problem, the first instance of (RFP) to arise can always be easily solved in the absence of dual degeneracy in the linear programming relaxation.  Usually the linear programming relaxation of (P) is solved in Step 1 of the algorithm to find $f_0$, and then (RFP) is solved.  In the absence of dual degeneracy the set $J$ is exactly equal to the indices of the basic variables of the optimal solution of the linear programming relaxation.  Hence, unless the solution of the linear programming problem is integral (and thus the optimal integer solution), (RFP) is infeasible.  If $B$ is the basis, there is some row of $B^{-1}$, say $\beta$, such that $\beta b$ is not integer.  Since for every $j \in J$, $a_j$ is a column of $B$, $\beta a_j$ is either 0 or 1.  Thus the function $\bar f(w) = \lfloor \beta w \rfloor - \beta w$ satisfies the property required by Step 3 of the algorithm.

3.2  Cutting  Planes 

Both  (RFP)  and  the  integer  programming  restriction  of  (RMP)  can  always  be  solved  using 
cutting  planes.  In  [5]  Chvatal  has  shown  the  following  (although  using  different  terminology): 


Theorem 3.2.1 (Chvatal)  Consider the integer programming problem $\max\{cx \mid Ax \le b,\ x \text{ integer}\}$.  If $\sum_j \alpha_j^k x_j \le \beta^k$, $k = 1, \ldots, r$, is a series of cuts constructed by Gomory's cutting plane algorithm, then there are Chvatal functions $F^k$, $k = 1, \ldots, r$, such that $F^k(a_j) = \alpha_j^k$ and $F^k(b) = \beta^k$, for all $k = 1, \ldots, r$.

Theorem  3.2.1  can  easily  be  adapted  to  our  setting. 

Theorem 3.2.2  Consider the restricted feasibility problem (RFP).  If $\sum_{j \in J} \alpha_j^k x_j + \sum_{i=1}^{m} \gamma_i^k x_i^a \le \beta^k$, $k = 1, \ldots, r$, is a series of cuts constructed by Gomory's cutting plane algorithm, then there are Chvatal functions $F^k$, $k = 1, \ldots, r$, such that $F^k(a_j) = \alpha_j^k$ for all $j \in J$, $F^k(e_i) = \gamma_i^k$ and $F^k(b) = \beta^k$, for all $k = 1, \ldots, r$.

Similarly, consider the restricted maximization problem (RMP).  If $\sum_{j \in J} \alpha_j^k x_j \le \beta^k$, $k = 1, \ldots, r$, is a series of cuts constructed by Gomory's cutting plane algorithm, then there are Chvatal functions $F^k$, $k = 1, \ldots, r$, such that $F^k(a_j) = \alpha_j^k$ for all $j \in J$ and $F^k(b) = \beta^k$, for all $k = 1, \ldots, r$.

For  continuity  of  exposition,  the  details  of  the  construction  of  the  function  guaranteed  by 
Theorem  3.2.2  are  left  until  the  end  of  this  subsection.  We  now  show  how  such  functions  can 
be  used  to  solve  both  (RFP)  and  (RMP).  (See  also  [19]  for  results  relating  cutting  planes 
and  Chvatal  functions.) 

Recall (RFP), written as an optimization problem with artificial variables,

$$
\begin{aligned}
\max\ & W = \sum_{i=1}^{m} -x_i^a \\
\text{s.t.}\ & \sum_{j \in J} a_{ij} x_j + x_i^a = b_i, \quad \text{for } i = 1, \ldots, m, \\
& \text{all } x \ge 0 \text{ and integer.}
\end{aligned}
$$

If the linear programming relaxation of (RFP) has an optimal solution with value less than 0, let $y$ be the optimal dual solution.  Then from linear programming duality, $y a_j \ge 0$ for all $j \in J$ and $yb < 0$.  Define $\bar f(w) = yw$.

Otherwise,  the  optimal  value  of  the  linear  programming  relaxation  of  (RFP)  is  0.  In  this 
case  we  generate  cuts  until  either 

1.  An  integer  solution  is  obtained,  and  we  return  to  Step  2  of  the  algorithm  with  optimal 
solution  x*,  and  W *  =  0;  or 

2.  We  finally  get  a  solution  to  the  linear  programming  problem  with  added  cuts,  whose 
value  is  negative.  That  is,  the  following  linear  programming  problem  has  value  less 
than  0: 


$$
\begin{aligned}
\max\ & \sum_{i=1}^{m} -x_i^a \\
\text{s.t.}\ & \sum_{j \in J} a_{ij} x_j + x_i^a = b_i, \quad \text{for } i = 1, \ldots, m, \\
& \sum_{j \in J} F^k(a_j) x_j + \sum_{i=1}^{m} F^k(e_i) x_i^a \le F^k(b), \quad \text{for } k = 1, \ldots, r, \\
& \text{all } x \ge 0,
\end{aligned}
$$

where the functions $F^k$, for $k = 1, \ldots, r$, are the Chvatal functions corresponding to the added cuts as in Theorem 3.2.2.  Let $(y, \mu)$ be the optimal dual solution to this linear programming problem, where $y$ is the vector of dual variables corresponding to the original constraints, and $\mu$ is the vector of dual variables corresponding to the added cutting planes.  Note that since the cutting-plane constraints are inequalities, we must have $\mu \ge 0$.  By linear programming duality, $y a_j + \sum_{k=1}^{r} \mu_k F^k(a_j) \ge 0$ for all $j \in J$ and $yb + \sum_{k=1}^{r} \mu_k F^k(b) < 0$.  Thus we can let $\bar f(w) = yw + \sum_{k=1}^{r} \mu_k F^k(w)$.  Since $\mu \ge 0$, $\bar f$ is in $C^m$ and it satisfies the required conditions of Step 3 of the algorithm.

The integer programming restriction of Problem (RMP) can be handled similarly, with the understanding that the constraint $\sum_{j \in J} f(a_j) x_j \le f(b)$ of (RMP) should be treated just as any other added cut.  Also recall that it is not necessary to find an integer optimal solution of (RMP), only to find an $\bar f$ with $\bar f(a_j) \ge f(a_j)$ for all $j \in J$, and $\bar f(b) < f(b)$.  If the optimal value of the linear programming relaxation of (RMP) is less than $f(b)$, then let $y$ be the optimal dual solution.  So, $y a_j \ge f(a_j)$ for all $j \in J$ and $yb < f(b)$.  In this case we can let $\bar f(w) = yw$.  Otherwise, add cuts with corresponding Chvatal functions $F^1, \ldots, F^r$, until either

1.  An integer optimum is obtained with value $f(b)$.  In this case the solution is optimal, and we return from (RMP) with $x^*$; or

2.  An optimal linear programming solution is reached (integer or fractional) with value less than $f(b)$.  As with (RFP), we let $(y, \mu)$ be the optimal dual solution and let $\bar f(w) = yw + \sum_{k=1}^{r} \mu_k F^k(w)$.  By linear programming duality we have that $\mu \ge 0$, so $\bar f$ is a Chvatal function, and that $\bar f(a_j) \ge f(a_j)$ for all $j \in J$ and $\bar f(b) < f(b)$ so that the conditions of Step 8 of the algorithm are satisfied.

We now give the details of the construction of the function $F$ corresponding to a cut as in Theorem 3.2.2.  Although these can be derived from [5] we have included them for completeness.  We will assume that we are working on (RFP).  The adaptation for (RMP) is straightforward: the same procedure is followed with the omission of the artificial variables.  Suppose that we have already added $r$ cuts, so that the system we currently have is

$$
\begin{aligned}
& \sum_{j \in J} a_{ij} x_j + x_i^a = b_i, \quad \text{for } i = 1, \ldots, m, \\
& \sum_{j \in J} F^k(a_j) x_j + \sum_{i=1}^{m} F^k(e_i) x_i^a \le F^k(b), \quad \text{for } k = 1, \ldots, r, \\
& \text{all } x \ge 0 \text{ and integer.}
\end{aligned}
$$

Let the slack variables which have been added to the tableau for the rows corresponding to the cuts be $s_1, \ldots, s_r$.  We can suppose without loss of generality that the $F$'s being constructed are always rounded down, and thus always have integer values.  Suppose that we want to cut on row $i$ of the current tableau, and that row $i$ has the form:

$$\sum_{j \in J} \eta_j x_j + \alpha_1 x_1^a + \cdots + \alpha_m x_m^a + \gamma_1 s_1 + \cdots + \gamma_r s_r = \delta \tag{1}$$

where $\delta$ is fractional.  The cut we wish to generate is

$$\sum_{j \in J} \lfloor \eta_j \rfloor x_j + \lfloor \alpha_1 \rfloor x_1^a + \cdots + \lfloor \alpha_m \rfloor x_m^a + \lfloor \gamma_1 \rfloor s_1 + \cdots + \lfloor \gamma_r \rfloor s_r \le \lfloor \delta \rfloor. \tag{2}$$

Note that in the usual statement of Gomory's algorithm, the cut used is (2) - (1), instead of (2).  However adding (2) alone has the same effect, since (1) is an equality and remains part of the problem.  Solving for each slack $s_k$ in its cut constraint of the system above, and substituting in (2), gives

$$\sum_{j \in J} \Big( \lfloor \eta_j \rfloor - \sum_{k=1}^{r} \lfloor \gamma_k \rfloor F^k(a_j) \Big) x_j + \sum_{i=1}^{m} \Big( \lfloor \alpha_i \rfloor - \sum_{k=1}^{r} \lfloor \gamma_k \rfloor F^k(e_i) \Big) x_i^a \le \lfloor \delta \rfloor - \sum_{k=1}^{r} \lfloor \gamma_k \rfloor F^k(b). \tag{3}$$
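Let the current basis be $B$, and let the $i$th row of $B^{-1}$ be $B_i^{-1} = (\beta_1, \ldots, \beta_m, \gamma_1, \ldots, \gamma_r)$.  Note that the coefficients of the slacks in (1) will always occur in the $i$th row of $B^{-1}$, since there were originally unit vectors in these columns.  We thus have that

$$\eta_j = B_i^{-1} \big( a_{1j}, a_{2j}, \ldots, a_{mj}, F^1(a_j), \ldots, F^r(a_j) \big)^T = (\beta_1, \ldots, \beta_m) a_j + \sum_{k=1}^{r} \gamma_k F^k(a_j), \quad j \in J. \tag{4}$$

Similarly

$$\alpha_i = (\beta_1, \ldots, \beta_m) e_i + \sum_{k=1}^{r} \gamma_k F^k(e_i), \quad i = 1, \ldots, m, \tag{5}$$

and

$$\delta = (\beta_1, \ldots, \beta_m) b + \sum_{k=1}^{r} \gamma_k F^k(b). \tag{6}$$

We now define

$$F(w) = \Big\lfloor (\beta_1, \ldots, \beta_m) w + \sum_{k=1}^{r} (\gamma_k - \lfloor \gamma_k \rfloor) F^k(w) \Big\rfloor. \tag{7}$$

Since $\gamma_k - \lfloor \gamma_k \rfloor \ge 0$ always, $F$ is a Chvatal function.  Also, since $F^k(w)$ is always forced to be integer, we can take $\lfloor \gamma_k \rfloor F^k(w)$ outside the round-down sign and write:

$$F(w) = \Big\lfloor (\beta_1, \ldots, \beta_m) w + \sum_{k=1}^{r} \gamma_k F^k(w) \Big\rfloor - \sum_{k=1}^{r} \lfloor \gamma_k \rfloor F^k(w). \tag{8}$$

Using (4), (5), (6) and (8), it is easy to verify that (7) generates the cut (3).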
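
The step from (7) to (8) can be checked numerically. The Python sketch below evaluates both forms at one point, given the basis-row entries and the integer values $F^k(w)$ of the earlier cut functions. The function names and the sample numbers are assumptions made for this illustration only; exact rational arithmetic is used so that the round-down is not affected by floating-point error.

```python
from fractions import Fraction
from math import floor


def cut_function_7(beta, gamma, F_vals, w):
    """Form (7): floor(beta.w + sum_k (gamma_k - floor(gamma_k)) * F^k(w))."""
    linear = sum(bi * wi for bi, wi in zip(beta, w))
    fractional = sum((g - floor(g)) * Fk for g, Fk in zip(gamma, F_vals))
    return floor(linear + fractional)


def cut_function_8(beta, gamma, F_vals, w):
    """Form (8): floor(beta.w + sum_k gamma_k * F^k(w)) - sum_k floor(gamma_k) * F^k(w)."""
    linear = sum(bi * wi for bi, wi in zip(beta, w))
    full = sum(g * Fk for g, Fk in zip(gamma, F_vals))
    integer_part = sum(floor(g) * Fk for g, Fk in zip(gamma, F_vals))
    return floor(linear + full) - integer_part


# Hypothetical data: m = 2 original rows, r = 2 earlier cuts.
beta = (Fraction(1, 2), Fraction(-1, 3))
gamma = (Fraction(7, 4), Fraction(-5, 2))
w = (5, 4)
F_vals = (3, -2)  # assumed integer values F^1(w), F^2(w)

value_7 = cut_function_7(beta, gamma, F_vals, w)
value_8 = cut_function_8(beta, gamma, F_vals, w)
```

The two values agree because each $\lfloor \gamma_k \rfloor F^k(w)$ is an integer and can therefore be moved across the round-down; with non-integer `F_vals` the identity would not hold in general.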

Notes:  If  cutting  planes  are  being  used  to  solve  the  “black  boxes,”  cuts  generated  in  (RFP) 
should  not  be  discarded  (except  for  terms  corresponding  to  the  artificial  variables)  when 
proceeding  to  (RMP).  They  will  remain  valid. 

Moreover, when solving (RMP), it is not necessary to continue to add cuts until the optimal integer solution is found.  It suffices to add cuts only until the optimal fractional value falls below $f(b)$.  Then the dual function $\bar f$ can be constructed as above, and it will have value less than $f(b)$ as required by Step 8 of the algorithm.

It is important to note that if one is solving a problem for which some or all of the facet defining inequalities (or some other “deep” cuts) are known, then these can be used in place of the Gomory cuts since it has been proven (see [4], [13] and [18]) that every valid inequality of (P) is equivalent to or dominated by an inequality generated by a Chvatal function.  Of course, one must still construct this function from the facet defining inequality.

3.3  Matching 

The  primal  dual  algorithm  of  Section  2  can  be  specialized  to  find  a  maximum  weight  matching 
in  a  weighted  graph  G.  We  have  chosen  this  example  because  (RFP)  can  be  solved  efficiently. 
When  it  becomes  necessary  to  solve  (RMP),  deep  cuts  can  be  generated  using  a  separation 
procedure  due  to  Padberg  and  Rao  ([17]).  This  approach  is  not  likely  to  be  more  efficient 
than  specialized  weighted  matching  algorithms,  but  provides  a  nice  illustration  of  how  the 
primal  dual  algorithm  can  be  tailored  to  a  specific  problem. 

Without loss of generality suppose that any maximum weight matching is a perfect matching ($G$ is easily altered so that this is the case).  Then, if $A$ is the node-edge incidence matrix of $G$, $b$ is the vector of all 1's, and $c$ is the vector of edge weights, the problem of finding a maximum weight matching is exactly the problem (P) of Section 1.  Let the node set of $G$ be $V$.

Definition:  Given a graph $G$, an odd set cover of $G$ is a set of node sets $N_1, \ldots, N_s$, each having odd cardinality greater than 1, and a set of singletons $v_1, \ldots, v_r$ such that every edge of $G$ either has both endpoints contained in some $N_k$ or is incident to some $v_i$.  The capacity of the cover is equal to $r + \sum_{k=1}^{s} \lfloor |N_k| / 2 \rfloor$.

Edmonds  (see  [7])  showed  that  the  following  duality  holds:  the  maximum  cardinality  of 
a  matching  in  a  graph  is  equal  to  the  minimum  capacity  of  an  odd  set  cover.  The  algorithm 
given  in  [7]  returns  both  a  maximum  cardinality  matching  and  the  corresponding  odd  set 
cover.  We will use this matching duality result here, and assume that we have available an algorithm that finds a maximum cardinality matching which also returns an odd set cover whose capacity is equal to the cardinality of the matching.

Problem (RFP) for the matching problem is efficiently solved.  Given $J$, we must determine if (P) is feasible using only the variables in $J$.  In the matching setting, we must determine if there is a perfect matching in $G$ using only the edges indexed by $J$.  We begin by finding the maximum cardinality matching on the graph whose edges consist only of those indexed by $J$.  If this is a perfect matching of $G$, then we have a feasible solution of (P) satisfying complementary slackness and we proceed to Step 6 or 7 of the algorithm.  Otherwise, we have a matching of size $p$, where $2p$ is less than the number of nodes in $G$.  The cardinality matching algorithm will also have returned an odd set cover of the edges indexed by $J$ with capacity $p$.  Let the odd set cover be $\{N_1, N_2, \ldots, N_s, v_1, \ldots, v_r\}$.  Without loss of generality assume that the vertices $v_1, \ldots, v_r$ correspond to the first $r$ rows of $A$.

Theorem 3.3.1  The function

$$\bar f(w) = \sum_{i=1}^{r} w_i + \sum_{k=1}^{s} \Big\lfloor \frac{\sum_{i \in N_k} w_i}{2} \Big\rfloor - \frac{1}{2} \sum_{i \in V} w_i$$

satisfies the condition of Step 3 in the algorithm.

Proof:  We must show that $\bar f(a_j) \ge 0$ for each $j \in J$ and that $\bar f(b) < 0$.  Recalling that $b$ is the vector of all 1's, the negative term in $\bar f(b)$ is $-|V|/2$.  The positive term will be equal to $r + \sum_{k=1}^{s} \lfloor |N_k| / 2 \rfloor$, the capacity of the cover returned by the cardinality matching algorithm.  Since the capacity of the cover is equal to the cardinality of the matching found, and the matching was not perfect, $\bar f(b) < 0$.

Now let $e$ be an edge indexed by $j \in J$.  Then $a_j$ has a 1 in the positions corresponding to the two endpoints of $e$ and 0's elsewhere.  Since $e$ is indexed by a member of $J$, it is covered by the odd set cover given by the cardinality matching algorithm.  If $e$ is incident with one of the $v_i$'s then $\bar f(a_j) \ge 1 - 1 = 0$.  If $e$ has both its endpoints contained in $N_k$, then $\lfloor \sum_{i \in N_k} (a_j)_i / 2 \rfloor = 1$ and $\bar f(a_j) \ge 1 - 1 = 0$.  □
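
The two conditions in the proof can be checked on a small example. The Python sketch below evaluates the function of Theorem 3.3.1 on a hypothetical graph made of two disjoint triangles, for which no perfect matching exists; the graph, the odd set cover, and the helper names `cover_function` and `incidence_column` are assumptions for illustration, and the cover is written down by hand rather than returned by a matching algorithm.

```python
from fractions import Fraction


def cover_function(w, singletons, odd_sets):
    """Sum of w over the singletons, plus floor(sum of w over N_k / 2) for each
    odd set N_k, minus half the sum of w over all nodes (w is an integer vector)."""
    positive = sum(w[v] for v in singletons)
    positive += sum(sum(w[i] for i in N) // 2 for N in odd_sets)
    return positive - Fraction(sum(w), 2)


def incidence_column(edge, n):
    """Column of the node-edge incidence matrix for edge = (u, v) on n nodes."""
    col = [0] * n
    col[edge[0]] = col[edge[1]] = 1
    return col


# Hypothetical graph: two disjoint triangles on nodes 0..5.
n = 6
edges_J = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]

# A maximum matching has 2 edges; this odd set cover has capacity 0 + 1 + 1 = 2.
singletons = []
odd_sets = [{0, 1, 2}, {3, 4, 5}]

value_on_b = cover_function([1] * n, singletons, odd_sets)
values_on_columns = [
    cover_function(incidence_column(e, n), singletons, odd_sets) for e in edges_J
]
```

Here the value at $b$ is the cover capacity minus $|V|/2$, which is negative exactly because the matching is not perfect, while every edge column evaluates to a nonnegative number.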

Thus (RFP) can be completely solved through the use of the cardinality matching algorithm.  We now consider (RMP).  When we reach Step 7 of the algorithm we have found a perfect matching using only edges in $J$, but the value of that perfect matching is less than $f(b)$.

When solving (RMP), it is necessary to find an integer solution of value $f(b)$ or a dual solution $f'$ with $f'(b) < f(b)$.  As discussed above, this can be accomplished by adding cutting planes until the value of (RMP) first drops below $f(b)$.  (Recall that (RMP) will
always  have  value  at  most  f(b).)  In  the  matching  setting  we  have  the  advantage  that  we 
know  what  the  facets  of  the  corresponding  polytope  are. 




Let $S$ be a set of nodes in the graph, and let $E(S)$ denote the set of edges with both endpoints in $S$.  Then every facet-defining inequality is of the form

$$\sum_{j \in E(S)} x_j \le \frac{|S| - 1}{2}$$

where $S$ is a node set of odd cardinality.  It is easy to see that the function

$$F(w) = \Big\lfloor \frac{\sum_{i \in S} w_i}{2} \Big\rfloor$$

defines the facet derived from an odd set $S$.
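
As a quick check of this correspondence, the following Python sketch applies the round-down function for an odd set $S$ to the incidence columns and to $b$, and tests the resulting inequality against a half-integral point. The graph (two triangles joined by one edge), the point `x`, and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction


def odd_set_function(w, S):
    """floor(sum of w over S / 2), for an integer vector w indexed by node."""
    return sum(w[i] for i in S) // 2


def incidence_column(edge, n):
    """Column of the node-edge incidence matrix for edge = (u, v) on n nodes."""
    col = [0] * n
    col[edge[0]] = col[edge[1]] = 1
    return col


# Hypothetical graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
n = 6
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
S = {0, 1, 2}

# Cut coefficients: 1 for edges inside S, 0 otherwise; right-hand side (|S| - 1) / 2.
coeffs = [odd_set_function(incidence_column(e, n), S) for e in edges]
rhs = odd_set_function([1] * n, S)

# Half-integral point meeting every degree equation: 1/2 on triangle edges, 0 on (2, 3).
x = [Fraction(1, 2)] * 6 + [Fraction(0)]
lhs = sum(cf * xj for cf, xj in zip(coeffs, x))
violated = lhs > rhs
```

The point `x` satisfies the degree equations but violates the inequality for $S$, which is the situation a separation routine is meant to detect.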

Now suppose that we have solved the linear programming relaxation of (RMP), and obtained a basic fractional solution with value equal to $f(b)$.  The fractional solution will violate one of the above facet describing inequalities.  Padberg and Rao [17] showed that the separation problem for the matching polytope can be solved in polynomial time.  That is, a facet-defining inequality that is violated by a given infeasible solution can be determined in polynomial time.  Thus in polynomial time we can generate a cut as described above that will reduce the value of (RMP) as desired, or find an integer solution with value equal to $f(b)$.


4  Future  Work 

Future  work  related  to  this  algorithm  should  be  directed  towards  finding  special  structures 
which  allow  efficient  solution  of  the  restricted  subproblems.  More  specifically,  one  aim  is  to 
identify  cases  where  the  problem  can  be  solved  without  resorting  to  the  use  of  cutting  planes 
in  the  solution  of  (RFP).  These  problems  can  be  of  two  types  -  those  where  the  type  of  data 
is  known  to  be  such  that  the  subproblems  can  be  solved  easily  (like  those  cases  discussed  in 
Section  3.1  for  example),  and  those  where  the  problem  structure  provides  other  avenues  for 
solution  (as  in  Section  3.3).  With  respect  to  this  last  type  of  problem,  the  richest  potential 
lies  with  integer  programming  problems  with  known  pseudopolynomial  algorithms.  It  would, 
of  course,  be  most  rewarding  to  use  our  framework  to  establish  pseudopolynomial  algorithms 
for  classes  of  integer  programs  for  which  there  do  not  yet  exist  “efficient”  solution  procedures. 
One  other  direction  for  future  work  is  in  the  area  of  column  generation.  In  problems  with 
exponentially  many  columns,  it  is  expected  that  our  procedure  will  not  only  keep  the  number 
of  active  columns  small  but  also  will  direct  the  user  to  a  “smart”  set  of  such  columns.